Preface

Trying to decompose an integer into a product of integers, we feel irritation. There should 
dwell the reason why any prime appears like a real gem that one can touch and hold. We thus 
muse ever and again how and when ancient people discovered the way of sifting out primes 
and began appreciating them. Perhaps those who conceived the divisibility had already some 
sieves in their minds. Indeed, a wealth of evidence has been excavated supporting our view.
The story to be told below must have originated more than five millennia ago 1), while the
primordial intellectual irritation has remained fresh and fundamental till today. 

The history of the Sieve Method is rich and fascinating; we would need a volume to 
exhaust the story. In the present article we shall instead concentrate on several pivotal 
ideas that made progress possible; so the scope is inevitably limited. Nevertheless, you will 
encounter instances of precious mathematical achievements that people in the future will 
certainly continue to relate. 

Notes are to be read as essential parts, although they are in the style of personal memoranda. Mathematical symbols and definitions are introduced where they are needed for the
first time, and will continue to be effective until otherwise stated. Theorems are given somewhat implicitly, and details such as domains of variables are to be inferred from the context.
References are restricted mostly to seminal works in respective developments. Basic facts
from Analytic Number Theory can be found in the monographs [26] [64].

Remark: This is a translation of our Japanese expository article that was published under 
the title 'An overview of sieve methods' in the second issue of the 52nd volume of Sugaku, the
Mathematical Society of Japan, April 2005. At this opportunity we have made some revision 
and changed the title into something more appropriate. Also, it should be noted that events 
in the last two decades are left untouched except for a few. 



Chapter 1. Brun's Sieve 

1.1 Mark any natural number that is divisible by the first prime 2, and repeat the same with
all other primes less than a given z > 2. Then any natural number less than $z^2$ that remains
unmarked is either 1 or a prime in the interval $[z, z^2)$, since such an integer does not have
two or more prime factors. This is a version of the well-known sieve method named after
Eratosthenes of Alexandria 2).
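The marking procedure just described can be sketched in a few lines of Python. This is only an illustration; the function name `unmarked_below_z_squared` and the choice z = 10 are ours and do not come from the sources cited here.

```python
def unmarked_below_z_squared(z):
    """Mark every multiple of each prime p < z; return the unmarked n, 1 <= n < z*z."""
    limit = z * z
    marked = [False] * limit
    small_primes = []
    for p in range(2, z):
        # p is prime exactly when no smaller prime divides it
        if all(p % q for q in small_primes):
            small_primes.append(p)
            for m in range(p, limit, p):   # p itself is marked as well
                marked[m] = True
    return [n for n in range(1, limit) if not marked[n]]


survivors = unmarked_below_z_squared(10)
print(survivors[:5], len(survivors))
```

With z = 10 the survivors are 1 together with the primes in [10, 100), in agreement with the statement above.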

Eratosthenes' Sieve might appear to be quite effective, especially when z is large, for it 
allows us to expand the table of all primes less than z to that of those less than $z^2$. However,
if we look into the quantitative aspect of the method or if it is required to count the number
of integers unsifted, then Eratosthenes' Sieve becomes virtually ineffectual 3). Viggo Brun
[12] confronted the challenge of improving Eratosthenes' Sieve to turn it into a quantitatively
effective device, and became the founder of the modern theory of the Sieve Method 4). Let
us see how he brought the first light [10] into the darkness of 2100 years.

1.2 Let P(z) be the product of all primes less than z, and let (m, n) be the greatest common
divisor of integers m, n. That an integer n does not have any prime factor less than z is
equivalent to (n, P(z)) = 1. Thus, the characteristic function of the set of all such integers is
expressed as

(1.1)    $\sum_{d \mid (n, P(z))} \mu(d)$,

where $\mu$ is the Möbius function. This coincides with the above procedure of marking integers.
In fact, if n has r marks, then the value of the last sum is equal to $(1 - 1)^r$. Hence (1.1) could
be identified with Eratosthenes' Sieve 5).
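As a small numerical check of (1.1), the following sketch sums $\mu(d)$ over the divisors of (n, P(z)) by listing the subsets of the marks of n, and compares the result with the condition (n, P(z)) = 1. The helper names and the value z = 12 are illustrative assumptions.

```python
from itertools import combinations
from math import gcd, prod


def primes_below(z):
    return [p for p in range(2, z) if all(p % q for q in range(2, p))]


def mobius_sum(n, z):
    """Sum of mu(d) over all d dividing gcd(n, P(z))."""
    marks = [p for p in primes_below(z) if n % p == 0]   # the r marks of n
    total = 0
    for k in range(len(marks) + 1):
        for _subset in combinations(marks, k):
            total += (-1) ** k        # mu(d) for d = product of the k chosen primes
    return total


z = 12
P = prod(primes_below(z))
indicator = [mobius_sum(n, z) for n in range(1, 201)]
print(all(indicator[n - 1] == (gcd(n, P) == 1) for n in range(1, 201)))
```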

We consider a finite sequence A of integers, and put $S(A, z) = \{n \in A : (n, P(z)) = 1\}$.
Then (1.1) implies that

(1.2)    $|S(A, z)| = \sum_{d \mid P(z)} \mu(d) |A_d|$,    $A_d = \{n \in A : n \equiv 0 \pmod{d}\}$.

Thus, in order to either evaluate or bound $|S(A, z)|$, we need to have certain information
about the behaviour of $|A_d|$ with variable d; and following a general practice we write

(1.3)    $|A_d| = \frac{\omega(d)}{d} X + R_d$,    $\omega(d) \ge 0$,

where $\omega$ is a multiplicative function. One may regard $(\omega(d)/d)X$ as the main term and $R_d$ as
the remainder; for instance, X can be seen as an approximation to |A|. The terms $R_d$ should
be small either individually or in a certain sense of mean. Inserting (1.3) into (1.2), we have

(1.4)    $|S(A, z)| = V(z, \omega) X + R(A, z)$,

where

(1.5)    $V(z, \omega) = \prod_{p < z} \left(1 - \frac{\omega(p)}{p}\right)$,    $R(A, z) = \sum_{d \mid P(z)} \mu(d) R_d$.

Hereafter p denotes a generic prime.
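The identity (1.2) can be tested directly on a short interval. The sketch below enumerates all divisors d of P(z) as subsets of the primes below z, so the number of summands doubles with every new prime; the function name `legendre_count` and the sample interval are our own choices.

```python
from itertools import combinations
from math import prod


def legendre_count(A, primes):
    """Right side of (1.2): sum of mu(d) |A_d| over all d dividing the product of primes."""
    total = 0
    for k in range(len(primes) + 1):
        for subset in combinations(primes, k):
            d = prod(subset)                       # prod(()) == 1, so d = 1 is included
            total += (-1) ** k * sum(1 for n in A if n % d == 0)
    return total


A = range(100, 200)
primes = [2, 3, 5, 7, 11, 13]                      # all primes below z = 15
sifted = [n for n in A if all(n % p for p in primes)]
print(legendre_count(A, primes), len(sifted))
```

Here both numbers equal the count of primes in [100, 200), since $15^2 > 200$.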

1.3 The identity (1.4) does not have much realistic content, however. To see this, we consider
the problem of counting primes in a given interval. Thus, let x be sufficiently large, and
put $A = \{n : x - y \le n < x\}$, $2 \le y \le x/2$. With this, we have $\omega \equiv 1$, X = y, $R_d = [-(x - y)/d] - [-x/d] - y/d$, where [a] is the integral part of a. The above discussion gives

(1.6)    $\pi(x) - \pi(x - y) = V(\sqrt{x}, 1)\, y + R(A, \sqrt{x})$,

where $\pi(x)$ is the number of primes less than x as usual. Since 6)

(1.7)    $V(z, 1) = \frac{e^{-c_E}}{\log z} \left(1 + O\left(\frac{1}{\log z}\right)\right)$    ($c_E$: the Euler constant),



Sieve Method and its History 



we get

(1.8)    $\pi(x) - \pi(x - y) = 2 e^{-c_E} (1 + o(1)) \frac{y}{\log x} + R(A, \sqrt{x})$.

On the other hand, the Prime Number Theorem suggests that

(1.9)    $\pi(x) - \pi(x - y) = (1 + o(1)) \frac{y}{\log x}$.

Thus it is plausible 7) to have $R(A, \sqrt{x}) \sim (1 - 2 e^{-c_E})(y/\log x)$. That is, neither of the two
terms on the right of (1.6) can be regarded as the main term or as the remainder term. It seems
extremely hard 8) to deduce this fact directly from the definition of $R(A, \sqrt{x})$.
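The discrepancy between (1.8) and (1.9) is already visible numerically. The following sketch, with the illustrative choice $x = 10^6$, $y = 10^5$, computes $|S(A, \sqrt{x})|$ by direct sifting and compares it with the term $V(\sqrt{x}, 1) y$; the function names are ours, and the decimal value of the Euler constant is inserted by hand.

```python
from math import exp, isqrt


def primes_below(n):
    """All primes p < n by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * n
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n, p)))
    return [i for i in range(n) if sieve[i]]


def sift_interval(x, y, z):
    """Return (|S(A, z)|, V(z, 1)) for A = {n : x - y <= n < x}."""
    lo = x - y
    alive = bytearray([1]) * y
    V = 1.0
    for p in primes_below(z):
        V *= 1 - 1 / p
        start = (-lo) % p                      # offset of the first multiple of p
        alive[start::p] = bytearray(len(range(start, y, p)))
    return sum(alive), V


x, y = 10**6, 10**5
count, V = sift_interval(x, y, isqrt(x))
main_term = V * y
predicted_ratio = 2 * exp(-0.5772156649)       # 2 e^{-c_E}, about 1.12
print(count, round(main_term), round(main_term / count, 3), round(predicted_ratio, 3))
```

The ratio of $V(\sqrt{x}, 1) y$ to the true count should come out near $2e^{-c_E}$ rather than near 1, so that R(A, √x) is of the same order as the supposed main term.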

1.4 As another example, we shall consider $\pi_2(x)$, the number of twin primes less than x. This
time we work with the sequence $A = \{n(n + 2) : 1 \le n \le x - 2\}$; in fact, $|S(A, \sqrt{x})| = \pi_2(x) + O(\sqrt{x})$. We have $\omega(2) = 1$, $\omega(p) = 2$ ($p \ge 3$), and by (1.4) and (1.7) it follows, after
a rearrangement, that

(1.10)   $\pi_2(x) = 8 e^{-2 c_E} (1 + o(1)) \frac{x}{(\log x)^2} \prod_{p \ge 3} \left(1 - \frac{1}{(p - 1)^2}\right) + R(A, \sqrt{x})$.

It is, however, much more difficult to deal with this $R(A, \sqrt{x})$ than the analogue in the last
section. In a conjecture due to Hardy and Littlewood [23] the asymptotic identity $\pi_2(x) \sim 2Cx/(\log x)^2$ is predicted, where C is the Euler product on the right of (1.10). Thus, again,
neither of the two terms on the right of (1.10) can be regarded as the main term or as the remainder
term, and the expression (1.10) does not stand for anything meaningful for the Twin Prime
Conjecture.

1.5 The difficulties with R(A, z) observed above stem from the fact that the number of
summands in the defining sum (1.5) may be too big to handle. As z increases, the factors 
of P(z) can become huge, and moreover their number too. Whether it is possible or not 
to detect any dramatic cancellation among the summands should obviously be tremendously 
difficult to see. Here is the reason for the limitation of Eratosthenes' Sieve. 

In 1915 Brun [10] broke this spell with a surprisingly simple idea. He threw the explicit 
formula (1.1) out, and replaced it by an inequality bounding the characteristic function from 
above and below so that he could gain an effective control over the size of participating factors 
of P(z). In other words, he moved the sieve theory from the classic world of exactness to the
modern world of reserved certainty. More explicitly, his idea is embodied in 

(1.11)   $\sum_{d \mid (n, P(z)),\ \nu(d) \le 2\ell + 1} \mu(d) \le \sum_{d \mid (n, P(z))} \mu(d) \le \sum_{d \mid (n, P(z)),\ \nu(d) \le 2\ell} \mu(d)$

with $\nu(d)$ the number of different prime factors of d; in fact we have, for any $\ell \ge 0$,

(1.12)   $\sum_{d \mid m,\ \nu(d) \le \ell} \mu(d) = (-1)^{\ell} \binom{\nu(m) - 1}{\ell}$.

The inequality (1.11) is often called Brun's Pure Sieve.
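Both (1.11) and (1.12) depend only on the number r of primes shared by n and P(z), since a squarefree number with r prime factors has $\binom{r}{k}$ divisors d with $\nu(d) = k$. The sketch below checks them for small r and $\ell$ on that basis; the function names and the ranges tested are illustrative assumptions.

```python
from math import comb


def truncated_mobius_sum(r, level):
    """Sum of mu(d) over the divisors d of a squarefree m with nu(m) = r and nu(d) <= level."""
    return sum((-1) ** k * comb(r, k) for k in range(level + 1))


def check_pure_sieve(max_r=8, max_l=4):
    for r in range(max_r + 1):
        exact = 1 if r == 0 else 0                 # the full sum in (1.1)
        for l in range(max_l + 1):
            lower = truncated_mobius_sum(r, 2 * l + 1)
            upper = truncated_mobius_sum(r, 2 * l)
            if not lower <= exact <= upper:
                return False
            if r >= 1 and truncated_mobius_sum(r, l) != (-1) ** l * comb(r - 1, l):
                return False
    return True


print(check_pure_sieve())
```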

1.6 Let us try Brun's idea in the instance of the twin prime problem; we are going to bound
$\pi_2(x)$ from above. Let $z \le \sqrt{x}$ and $\ell$ be parameters to be fixed later. We apply (1.11) to the A of Section
1.4, and have

         $\pi_2(x) \le |S(A, z)| + z$
(1.13)   $\le x \sum_{d \mid P(z),\ \nu(d) \le 2\ell} \mu(d) \frac{\omega(d)}{d} + \sum_{d < z^{2\ell}} \omega(d) + z$,

in which we have used $|R_d| \le \omega(d)$. Compared with the sum of divisors, the second sum over
d is $O(\ell z^{2\ell} \log z)$. The difference between the first sum and $V(z, \omega)$ is

(1.14)   $-\sum_{d \mid P(z),\ \nu(d) \ge 2\ell + 1} \mu(d) \frac{\omega(d)}{d} \ll 2^{-2\ell} \sum_{d \mid P(z)} 2^{\nu(d)} \frac{\omega(d)}{d}$;

the symbol $\ll$ indicates in general that the absolute value of the left side is less than a
constant multiple of the right side. The last sum is $O(V(z, 1)^{-4})$. Collecting these and
setting $z = \exp(\log x/(100 \log\log x))$, $\ell = [\log x/(4 \log z)]$, we obtain 9)

(1.15)   $\pi_2(x) \ll x \left(\frac{\log\log x}{\log x}\right)^2$    or    $\frac{\pi_2(x)}{\pi(x)} \ll \frac{(\log\log x)^2}{\log x}$.

Therefore, twin primes occur far less frequently than ordinary primes. It is literally hopeless
to deduce this fact via (1.10). It is amazing that the imperfect (1.11) could yield any result
that the exact (1.1) would never be capable of 10).

1.7 Brun's Sieve, the title of the present chapter, is an improved version [12] of his Pure 
Sieve; a combinatorial sophistication was introduced into the choice of divisors in (1.11). It 
enabled Brun to achieve the impressive bound 

(1.16)   $\pi_2(x) \ll \frac{x}{(\log x)^2}$.

In view of Hardy-Littlewood's conjecture mentioned above, this should be the best possible,
save for the implied constant. The construction of Brun's Sieve is, however, so intricate that
we have to skip the details 11), and state only the conclusion (1.22) below, moreover in a
considerably abridged fashion. Nevertheless, we shall reach an equivalent result in Section
3.5 via a somewhat different argument.

For the present and later purposes, we need to rearrange the specifications introduced in
[12], to be in accordance with today's practice. Thus, it is customary to suppose that
for any $2 \le z_1 < z_2$

(1.17)   $\frac{V(z_1, \omega)}{V(z_2, \omega)} = \prod_{z_1 \le p < z_2} \left(1 - \frac{\omega(p)}{p}\right)^{-1} \le \left(\frac{\log z_2}{\log z_1}\right)^{\kappa} \left(1 + O\left(\frac{1}{\log z_1}\right)\right)$

with a $\kappa > 0$, the dimension of the sieve problem under consideration 12). This is to be
compared with (1.7). For example, in Section 1.4 the Twin Prime Conjecture is discussed as
a sieve problem of dimension 2. On the other hand, the inequality (1.11) is understood as a
particular construction of sieve weights $\rho_r(d)$ such that for any integer n

(1.18)   $(-1)^r \left\{ \sum_{d \mid (n, P(z))} \mu(d) - \sum_{d \mid (n, P(z))} \mu(d) \rho_r(d) \right\} \ge 0$,    $\rho_r(1) = 1$,

with the convention $\rho_{r_1} = \rho_{r_2}$ ($r_1 \equiv r_2 \pmod 2$). We have, because of (1.3),

(1.19)   $(-1)^r \left\{ |S(A, z)| - V(z, \omega; \rho_r) X \right\} \ge (-1)^r R(A, z; \rho_r)$,

where

(1.20)   $V(z, \omega; \rho_r) = \sum_{d \mid P(z)} \mu(d) \rho_r(d) \frac{\omega(d)}{d}$,    $R(A, z; \rho_r) = \sum_{d \mid P(z)} \mu(d) \rho_r(d) R_d$.

According as $r \equiv 0, 1 \pmod 2$, the inequality (1.19) is called the lower and the upper bound of
|S(A, z)|, respectively; occasionally only one of them is considered. Naturally, we shall regard $V(z, \omega; \rho_r) X$
as the main term, and $R(A, z; \rho_r)$ as the remainder. In order to have any effective control
over the latter, we ought to impose a limitation on the size of participating d. For this sake
we introduce the condition

(1.21)   $\rho_r(d) = 0$,    $d \ge D$,    $d \mid P(z)$.

The parameter D is called the level of the sieve weights $\rho_r$. A sieve problem is to find $\rho_r$ that
yield any good main term under these specifications 13).

We now exhibit the principal result of Brun [12], with a drastic simplification: Let
$D = z^{\tau}$. Then there exist characteristic functions $\rho_r$ such that

(1.22)   $V(z, \omega; \rho_r) = \left(1 + O\left(e^{-\frac{1}{2} \tau \log \tau}\right)\right) V(z, \omega)$.

1.8 Let us apply the last assertion to the situation treated in Section 1.4. We set $z^{\tau} = \sqrt{x}$;
then the remainder term can obviously be ignored. With a sufficiently large $\tau$, we get the
assertion

(1.23)   $|\{n \le x : \text{if } p \mid n(n + 2) \text{ then } p \ge x^{1/(2\tau)}\}| \asymp \frac{x}{(\log x)^2}$.

This implies not only (1.16) but also that there exist infinitely many pairs (n, n + 2) of integers
such that the number of prime factors of each is less than $2\tau$.

In this way, Brun [12] accomplished the first definitive advance toward the Twin Prime
Conjecture. If we use instead his Pure Sieve, then it would be required to set $\ell = [25 \log\log x]$
and thus $\tau$ would be of order $\log\log x$. Hence Brun's Sieve is far stronger. Certainly the same could be
asserted about Goldbach's Conjecture 14); we need only to move to the sequence $\{n(N - n) : 3 \le n \le N - 3\}$ with an even integer N.

1.9 It is possible to treat the Twin Prime Conjecture as a one dimensional or a linear sieve
problem. To see this, let $\varphi$ be the Euler totient function and li the logarithmic integral; and
put

(1.24)   $\pi(x; k, l) = |\{p \le x : p \equiv l \pmod{k}\}|$,    $E(x; k, l) = \pi(x; k, l) - \frac{\mathrm{li}\, x}{\varphi(k)}$.

The latter is called the remainder term in the prime number theorem for arithmetic progressions. To the sequence $A = \{p + 2 : 3 \le p \le x\}$, we apply Brun's Sieve; thus $X = \mathrm{li}\, x$,
$\omega(2) = 0$, $\omega(p) = p/(p - 1)$ ($p \ge 3$), and $\kappa = 1$ as well as $R_d = E(x; d, -2)$. Let $\tau$ be
sufficiently large as before. Then we have

(1.25)   $|S(A, z)| = \left(1 + O\left(e^{-\frac{1}{2} \tau \log \tau}\right)\right) V(z, \omega)\, \mathrm{li}\, x + O\left( \sum_{d < z^{\tau},\ 2 \nmid d} |E(x; d, -2)| \right)$.

This time, the issue is how to deal with the second term on the right side. We assume that

(1.26)   $\sum_{q \le Q} \max_{a \pmod q,\ (a, q) = 1} |E(x; q, a)| \ll \frac{x}{(\log x)^A}$,    $A \ge 3$.

Then we have

(1.27)   $|\{p \le x : p + 2 \text{ does not have prime divisors less than } Q^{1/\tau}\}| \asymp \frac{x}{(\log Q)(\log x)}$.

For example, the Extended Riemann Hypothesis would allow us to set $Q = \sqrt{x}/(\log x)^{A+1}$;
and

(1.28)   $|\{p : \nu(p + 2) \le 2\tau + 1\}| = \infty$.

Goldbach's Conjecture could be treated in much the same way. Comparing this with the
assertions in the previous section, we apprehend the strength of the Extended Riemann
Hypothesis 15).

At (1.25)-(1.26) is observed a typical instance of the relation between a sieve problem 
and the distribution of the relevant sequence of integers among arithmetic progressions. That 
is, a sieve problem upon a particular sequence of integers is reduced to the discussion of the 
distribution of the elements of the sequence among arithmetic progressions with variable moduli. One may see that if the decomposition into arithmetic progressions is made too fine, then
relevant moduli could become too large to handle the remainder term in the sieve effectively.
Also it might be conceivable that relatively small moduli could contribute substantially. This
situation reminds us of the Circle Method of Ramanujan and Hardy 16); that is, there appears
to exist a relation between sieve problems and the Farey sequence. We shall encounter the 
same in the next section but with a somewhat different context. 

1.10 Brun's Sieve yields remarkable assertions not only about those great conjectures but 
also about fundamental queries in the theory of the distribution of primes: 

(1.29)   $\pi(x) - \pi(x - y) \ll \frac{y}{\log y}$    ($2 \le y \le x - 2$),

(1.30)   $\pi(x; k, l) \ll \frac{x}{\varphi(k) \log(x/k)}$    ($2 \le k \le x/2$).

In fact, (1.29) follows immediately from an obvious modification of the discussion in Section
1.3. On the other hand, as to (1.30) we may certainly assume that (k, l) = 1; then for the
sequence $A = \{n \le x : n \equiv l \pmod{k}\}$ the specification (1.3) holds with X = x/k, $|R_d| \le 1$;
$\omega(p) = 0$ as $p \mid k$, and $\omega(p) = 1$ as (p, k) = 1. In particular, we have $V(z, \omega) \le (k/\varphi(k)) V(z, 1)$.
The rest is straightforward. The assertion (1.30) is called traditionally the Brun-Titchmarsh
theorem.

As a matter of fact, if the uniformity as displayed prominently in (1.29)-(1.30) is required, 
any currently available analytic method which relies on the theory of the Riemann zeta and 
the Dirichlet L-functions is unable to produce anything comparable to the last two assertions. 
Even under the Extended Riemann Hypothesis, the situation would not change. It is indeed 
amazing that such an elementary idea as Brun's could ever take us close to the very subtlety 
of the distribution of primes. 

Chapter 2. Linnik's and Selberg's Sieves 

2.1 Some 20 years had passed since Brun's fundamental work [12] when Ju. V. Linnik [41] 
marked a new departure in the Sieve Method; and in a few years A. Selberg [68]-[70] made 
an independent leap. Selberg's idea was a distinctive incision into the sieve theory in general.
The construction of his sieve, i.e., his sieve weights, is fundamentally different from Brun's; 
thus he brought a structural change into the Sieve Method. On the other hand, Linnik's idea 
would later be appreciated because not only of its highly effective sieve effect but also of the 
argument itself that he employed. With Linnik's seminal work, a general principle was born, 
which has been and is still a driving force behind many major works in Analytic Number
Theory. This is now called the Large Sieve, in a much wider context than the title of his work 
indicated then. 

We shall show the essentials of the two ideas. Also, a duality relation between them will 
be disclosed. Further, it will be witnessed that the Large Sieve yields not only a sieve bound but
also a spectacular conclusion about the distribution of primes in arithmetic progressions. 

2.2 We need first to change the technical definition introduced in Section 1.7 of a sieve problem
into a conceptual setting. Thus, let $\Omega(p)$ be a set of residue classes mod p, and let us write
$n \in \Omega(p)$ instead of $n \pmod p \in \Omega(p)$. We put

(2.1)    $S(A, z; \Omega) = \{n \in A : n \notin \Omega(p),\ \forall p < z\}$,

with any sequence A of integers 17). A sieve problem is newly defined to be the estimation of
$|S(A, z; \Omega)|$, though we shall consider mainly the situation where A is the interval

(2.2)    $\mathcal{N} = [M, M + N) \cap \mathbb{Z}$,    $M \in \mathbb{Z}$, $N \in \mathbb{N}$.

In Section 1.4, the Twin Prime Conjecture is treated with M = 0, N = [x - 2], $\Omega(2) = \{0\}$,
$\Omega(p) = \{0, -2\}$ ($p \ge 3$), $|\Omega(p)| = \omega(p)$. In this example, we have $|\Omega(p)| \le 2$; but in general
we should not restrict the size of $\Omega(p)$. However, if we have, for instance, always $|\Omega(p)| > cp$
with a certain fixed c > 0, Brun's Sieve is not applicable, for the condition (1.17) on the sieve
dimension is violated. Then, is there any sieve procedure that is effective even when $|\Omega(p)|$
may become huge like that? It was Linnik [41] who gave the first answer to this intriguing
problem.
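The conceptual definition (2.1) translates almost word for word into code. In the following sketch the sets of excluded residue classes are supplied as a function `omega`; that name, `sifted_set`, `twin_omega` and the sample parameters are illustrative assumptions.

```python
def primes_below(z):
    return [p for p in range(2, z) if all(p % q for q in range(2, p))]


def sifted_set(M, N, z, omega):
    """Elements n of [M, M + N) with n mod p outside omega(p) for every prime p < z."""
    ps = primes_below(z)
    return [n for n in range(M, M + N)
            if all(n % p not in omega(p) for p in ps)]


def twin_omega(p):
    # classes removed in the twin prime problem: n = 0 and n = -2 (mod p)
    return {0} if p == 2 else {0, (-2) % p}


print(sifted_set(1, 100, 11, twin_omega))   # [11, 17, 29, 41, 59, 71]
```

With z = 11 and the interval [1, 101) the survivors are the smaller members of the twin prime pairs between 11 and 100.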

2.3 Linnik argued as follows: Let $\varpi$ be the characteristic function of the set $\{n \in \mathbb{Z} : n \notin \Omega(p),\ \forall p < z\}$, and put

(2.3)    $U(\theta) = \sum_{M \le n < M + N} \varpi(n) \exp(2\pi i n \theta)$,
         $U(\theta; p, a) = \sum_{M \le n < M + N,\ n \equiv a \pmod p} \varpi(n) \exp(2\pi i n \theta)$.


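Then we have

(2.4)    $\sum_{a = 1}^{p - 1} \left|U\left(\theta + \frac{a}{p}\right)\right|^2 = p \sum_{a = 1}^{p} |U(\theta; p, a)|^2 - |U(\theta)|^2$.

On the other hand, since $U(\theta; p, a) = 0$ if $a \in \Omega(p)$, p < z, we have also

(2.5)    $|U(\theta)|^2 \le \left( \sum_{a = 1}^{p} |U(\theta; p, a)| \right)^2 \le (p - |\Omega(p)|) \sum_{a = 1}^{p} |U(\theta; p, a)|^2$,    p < z,

or 18)

(2.6)    $|U(\theta)|^2 \frac{|\Omega(p)|}{p - |\Omega(p)|} \le \sum_{a = 1}^{p - 1} \left|U\left(\theta + \frac{a}{p}\right)\right|^2$,    p < z.

By the decomposition law of residue classes, we have

(2.7)    $|U(\theta)|^2 \prod_{p \mid q} \frac{|\Omega(p)|}{p - |\Omega(p)|} \le \sum_{a \pmod q,\ (a, q) = 1} \left|U\left(\theta + \frac{a}{q}\right)\right|^2$,    $q \mid P(z)$.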

In this way, we are led to 19)

(2.8)    $|S(\mathcal{N}, z; \Omega)|^2\, G(z, \Omega) \le \sum_{q < z} \sum_{a \pmod q,\ (a, q) = 1} \left| \sum_{M \le n < M + N} \varpi(n) \exp\left(2\pi i \frac{a}{q} n\right) \right|^2$,

where

(2.9)    $G(z, \Omega) = \sum_{q < z} \mu(q)^2 \prod_{p \mid q} \frac{|\Omega(p)|}{p - |\Omega(p)|}$.

Hence, our initial problem has been reduced to the estimation of the sum in (2.8) over
the Farey sequence. We skip the relevant discussion by Linnik himself, since we shall show
an assertion better than his, in Section 2.6. What is essential here is to follow the procedure
leading to (2.8), which is termed Linnik's Sieve 20). Nevertheless, the final assertion should
be displayed: By the inequality (2.30), we have

(2.10)   $|S(\mathcal{N}, z; \Omega)| \le \frac{N + z^2}{G(z, \Omega)}$.
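To get a feeling for the size of the bound (2.10), one may evaluate $G(z, \Omega)$ exactly for the twin prime problem of Section 1.4 and compare $(N + z^2)/G(z, \Omega)$ with the true sifted count. The sketch below does this with rational arithmetic; all names and the parameters M = 1, N = 1000, z = 30 are illustrative assumptions.

```python
from fractions import Fraction


def prime_factors(q):
    """Prime factors of q, with multiplicity, in increasing order."""
    out, p = [], 2
    while p * p <= q:
        while q % p == 0:
            out.append(p)
            q //= p
        p += 1
    if q > 1:
        out.append(q)
    return out


def G(z, omega_size):
    """Sum over squarefree q < z of the product of |Omega(p)| / (p - |Omega(p)|), p | q."""
    total = Fraction(0)
    for q in range(1, z):
        ps = prime_factors(q)
        if len(ps) != len(set(ps)):
            continue                               # q is not squarefree
        term = Fraction(1)
        for p in ps:
            term *= Fraction(omega_size(p), p - omega_size(p))
        total += term
    return total


def twin_size(p):
    return 1 if p == 2 else 2                      # |Omega(p)| for the twin prime problem


def sifted_count(M, N, z):
    small = [p for p in range(2, z) if len(prime_factors(p)) == 1]
    return sum(1 for n in range(M, M + N)
               if all(n % p != 0 and (n + 2) % p != 0 for p in small))


M, N, z = 1, 1000, 30
bound = (N + z * z) / G(z, twin_size)
print(sifted_count(M, N, z), float(bound))
```

The bound is far from sharp for such small parameters, which is to be expected, since the term $z^2$ is then comparable with N.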

2.4 Selberg argued as follows: We extend $\Omega$ multiplicatively, and let $\lambda$ be an arbitrary real-valued function such that $\lambda(1) = 1$. We have, for any integer n,

(2.11)   $\varpi(n) \le \left( \sum_{d \mid P(z),\ n \in \Omega(d)} \lambda(d) \right)^2 = \sum_{d_1, d_2 \mid P(z),\ n \in \Omega([d_1, d_2])} \lambda(d_1) \lambda(d_2)$,

where $[d_1, d_2]$ is the least common multiple of $d_1$, $d_2$. This inequality is trivial; nevertheless,
its consequence is impressive.

For the sake of simplicity, we assume that $\lambda(d) = 0$ either if $d \ge z$ or if $\mu(d) = 0$.
Summing (2.11) over $n \in \mathcal{N}$ and exchanging the order of summation, we have

(2.12)   $|S(\mathcal{N}, z; \Omega)| \le N \cdot S + R$,

where

(2.13)   $S = \sum_{d_1, d_2 < z} \frac{|\Omega([d_1, d_2])|}{[d_1, d_2]} \lambda(d_1) \lambda(d_2)$,    $|R| \le \sum_{d_1, d_2 < z} |\Omega([d_1, d_2])|\, |\lambda(d_1) \lambda(d_2)|$.

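Selberg computed the minimum of the quadratic form S with the side condition $\lambda(1) = 1$.
His reasoning is illuminating 21). We note first

(2.14)   $\frac{|\Omega([d_1, d_2])|}{[d_1, d_2]} = \frac{|\Omega(d_1)|}{d_1} \cdot \frac{|\Omega(d_2)|}{d_2} \cdot \frac{(d_1, d_2)}{|\Omega((d_1, d_2))|}$,

and by the Möbius inversion

(2.15)   $\frac{(d_1, d_2)}{|\Omega((d_1, d_2))|} = \sum_{f \mid d_1,\ f \mid d_2} \prod_{p \mid f} \frac{p - |\Omega(p)|}{|\Omega(p)|}$,    $\mu(d_1) \mu(d_2) \ne 0$.

Thus

(2.16)   $S = \sum_{f < z} \prod_{p \mid f} \frac{p - |\Omega(p)|}{|\Omega(p)|}\, \xi(f)^2$,    $\xi(f) = \sum_{d < z,\ d \equiv 0 \pmod f} \frac{|\Omega(d)|}{d} \lambda(d)$.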
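The inversion of the linear transform $\lambda \mapsto \xi$ is given by

(2.17)   $\lambda(d) = \frac{d}{|\Omega(d)|} \sum_{g < z/d} \mu(g) \xi(dg)$.

The side condition $\lambda(1) = 1$ is thus transformed accordingly, and

(2.18)   $S = \frac{1}{G(z, \Omega)} + \sum_{f < z} \prod_{p \mid f} \frac{p - |\Omega(p)|}{|\Omega(p)|} \left( \xi(f) - \frac{\mu(f) H(f, \Omega)}{G(z, \Omega)} \right)^2$,    $H(f, \Omega) = \prod_{p \mid f} \frac{|\Omega(p)|}{p - |\Omega(p)|}$.

Hence the optimal $\xi$ is found, and inserting it into (2.17) we are led to the assertion that

(2.19)   $\lambda(d) = \frac{\mu(d)}{G(z, \Omega)} \prod_{p \mid d} \frac{p}{p - |\Omega(p)|} \sum_{g < z/d,\ (d, g) = 1} \mu(g)^2 H(g, \Omega)$.

Obviously the assumption on $\lambda$ imposed above is satisfied by this specialization. Moreover,
we have

(2.20)   $|\lambda(d)| \le \mu(d)^2$.

In fact, we have, for any d < z ($\mu(d) \ne 0$),

(2.21)   $G(z, \Omega) = \sum_{f \mid d} \mu(f)^2 H(f, \Omega) \sum_{g < z/f,\ (d, g) = 1} \mu(g)^2 H(g, \Omega)$,

from which (2.20) follows immediately. Collecting these, we find that

(2.22)   $|S(\mathcal{N}, z; \Omega)| \le \frac{N}{G(z, \Omega)} + R$,    $|R| \le \left( \sum_{d < z} \mu(d)^2 |\Omega(d)| \right)^2$.

The procedure of the present section is called Selberg's Sieve 22).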
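The optimization can be checked in exact arithmetic. The sketch below builds weights $\lambda(d)$ as the factor $\mu(d)/G$ times the product of $p/(p - |\Omega(p)|)$ over $p \mid d$ times the sum of $H(g)$ over squarefree g < z/d coprime to d, which is our reading of the optimal choice and should be taken as an assumption. It then verifies that $\lambda(1) = 1$, that $|\lambda(d)| \le 1$, and that the quadratic form S of (2.13) takes the value 1/G. The function names, the twin prime choice of $|\Omega(p)|$ and z = 20 are illustrative.

```python
from fractions import Fraction
from math import gcd


def prime_factors(q):
    out, p = [], 2
    while p * p <= q:
        while q % p == 0:
            out.append(p)
            q //= p
        p += 1
    if q > 1:
        out.append(q)
    return out


def selberg_weights(z, omega_size):
    """Return ({d: lambda(d)}, G) for squarefree d < z (assumed form of the optimum)."""
    def H(q):
        value = Fraction(1)
        for p in prime_factors(q):
            value *= Fraction(omega_size(p), p - omega_size(p))
        return value

    squarefree = [q for q in range(1, z)
                  if len(prime_factors(q)) == len(set(prime_factors(q)))]
    G = sum((H(q) for q in squarefree), Fraction(0))
    weights = {}
    for d in squarefree:
        ps = prime_factors(d)
        scale = Fraction(1)
        for p in ps:
            scale *= Fraction(p, p - omega_size(p))
        inner = sum((H(g) for g in squarefree if d * g < z and gcd(d, g) == 1),
                    Fraction(0))
        weights[d] = (-1) ** len(ps) * scale * inner / G
    return weights, G


def quadratic_form(weights, omega_size):
    """S of (2.13), using |Omega(m)| / m = product over p | m of |Omega(p)| / p."""
    total = Fraction(0)
    for d1, w1 in weights.items():
        for d2, w2 in weights.items():
            lcm = d1 * d2 // gcd(d1, d2)
            density = Fraction(1)
            for p in prime_factors(lcm):
                density *= Fraction(omega_size(p), p)
            total += density * w1 * w2
    return total


def twin_size(p):
    return 1 if p == 2 else 2


weights, G = selberg_weights(20, twin_size)
print(weights[1], quadratic_form(weights, twin_size) == 1 / G)
```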

2.5 A comparison of (2.22) with (2.10) might cause the incorrect impression that Selberg's
Sieve is inferior to Linnik's 23). In fact, (2.10) could be deduced with Selberg's Sieve as well 24).
In order to show this, we express the characteristic function of the set $\{n \in \mathbb{Z} : n \in \Omega(d)\}$
as

(2.23)   $\frac{1}{d} \sum_{a \pmod d} \sum_{h \in \Omega(d)} \exp\left(2\pi i (n - h) \frac{a}{d}\right) = \frac{1}{d} \sum_{q \mid d} \sum_{a \pmod q,\ (a, q) = 1} \left( \sum_{h \in \Omega(d)} \exp\left(-2\pi i \frac{a}{q} h\right) \right) \exp\left(2\pi i \frac{a}{q} n\right)$.
Inserting this into (2.11), we have

(2.24)   $|S(\mathcal{N}, z; \Omega)| \le \sum_{M \le n < M + N} \left| \sum_{q < z} \sum_{a \pmod q,\ (a, q) = 1} b\left(\frac{a}{q}\right) \exp\left(2\pi i \frac{a}{q} n\right) \right|^2$,

where

(2.25)   $b\left(\frac{a}{q}\right) = \sum_{d < z,\ d \equiv 0 \pmod q} \frac{\lambda(d)}{d} \sum_{h \in \Omega(d)} \exp\left(-2\pi i \frac{a}{q} h\right)$.

Applying the inequality (2.31) below to the right side of (2.24), we get

(2.26)   $|S(\mathcal{N}, z; \Omega)| \le (N + z^2) \sum_{q < z} \sum_{a \pmod q,\ (a, q) = 1} \left| b\left(\frac{a}{q}\right) \right|^2$.

The double sum is equal to

(2.27)   $\sum_{d_1, d_2 < z} \frac{\lambda(d_1) \lambda(d_2)}{d_1 d_2} \sum_{h_1 \in \Omega(d_1)} \sum_{h_2 \in \Omega(d_2)} \sum_{q \mid (d_1, d_2)} \sum_{a \pmod q,\ (a, q) = 1} \exp\left(2\pi i \frac{a}{q} (h_1 - h_2)\right)$,

which coincides with S above, as can readily be seen by observing the multiplicativity of the
construction. Hence we have

(2.28)   $|S(\mathcal{N}, z; \Omega)| \le (N + z^2) \cdot S$.

By the argument of the previous section, up to (2.19), we obtain (2.10) again.

Therefore the important upper bound (2.10) has been proved in two ways. There is an 
obvious duality between them. It should be interesting to know that there is such an intrinsic 
relation between Linnik's and Selberg's ideas which occurred independently. By the way, the 
inequalities (2.8) and (2.24) are typical instances of applications of the Large Sieve. 

2.6 We now exhibit the fundamental inequality of the Large Sieve: Let {ip m } be a finite set 
in a Hilbert space equipped with the inner product (,). Then we have, for any ip in the 



space, 
(2.29) 



25) 



eW%M^<<^>- 



From this, a set of useful inequalities follow. Among them, the following two are utilised in
the above: Let $\{\theta_r\}$ be a sequence in the unit interval, whose elements are well separated
with the minimum distance $\delta > 0$ (mod 1). Then we have, for any complex vectors $\{a_n\}$,

(2.30)   $\sum_r \left| \sum_{M \le n < M + N} a_n \exp(2\pi i n \theta_r) \right|^2 \le (N - 1 + \delta^{-1}) \sum_{M \le n < M + N} |a_n|^2$

and

(2.31)   $\sum_{M \le n < M + N} \left| \sum_r b_r \exp(2\pi i n \theta_r) \right|^2 \le (N - 1 + \delta^{-1}) \sum_r |b_r|^2$,

where the interval [M, M + N) is as in (2.2). The latter is a consequence of the former, and
vice versa. This is due to the well-known fact that the norms of a bounded linear operator
and its adjoint acting in a Hilbert space are equal to each other.
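The inequality (2.30) is easy to probe numerically at the Farey points a/q with $q \le Q$, (a, q) = 1, for which any two distinct points are at least $1/Q^2$ apart modulo 1, so that $\delta^{-1} \le Q^2$. The sketch below evaluates both sides for pseudo-random coefficients; the function name, the seed and the sizes are illustrative assumptions, and floating-point arithmetic is used throughout.

```python
import cmath
import random
from math import gcd


def large_sieve_sides(a, M, Q):
    """Both sides of (2.30) at the points r/q, q <= Q, (r, q) = 1, with delta >= 1/Q^2."""
    N = len(a)
    lhs = 0.0
    for q in range(1, Q + 1):
        for r in range(1, q + 1):
            if gcd(r, q) == 1:
                s = sum(a[idx] * cmath.exp(2j * cmath.pi * (M + idx) * r / q)
                        for idx in range(N))
                lhs += abs(s) ** 2
    rhs = (N - 1 + Q * Q) * sum(abs(c) ** 2 for c in a)
    return lhs, rhs


rng = random.Random(0)
coeffs = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(60)]
lhs, rhs = large_sieve_sides(coeffs, M=5, Q=7)
print(lhs <= rhs)
```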

2.7 Hence, as far as finite intervals are concerned, Linnik's and Selberg's Sieves give rise to
the same upper bound (2.10). Here are a few assertions that are consequences of (2.10):

(2.32)   $\pi(x) - \pi(x - y) \le 2(1 + o(1)) \frac{y}{\log y}$,

(2.33)   $\pi(x; k, l) \le 2(1 + o(1)) \frac{x}{\varphi(k) \log(x/k)}$,

(2.34)   $\pi_2(x) \le 16(1 + o(1)) \prod_{p \ge 3} \left(1 - \frac{1}{(p - 1)^2}\right) \frac{x}{(\log x)^2}$.

These can be proved, via (2.10), with the corresponding $\mathcal{N}$ and $\Omega$, with $z = (N/\log N)^{1/2}$.
The asymptotic evaluation of $G(z, \Omega)$ should not cause any difficulty. Another interesting
application could be obtained with $\Omega(p)$ being the set of all quadratic non-residues (mod p).

Thus, it is understood that as far as upper bounds are concerned Linnik's and Selberg's
Sieves are superior to Brun's 26). For instance, the bound (2.34) should be compared with
the Hardy-Littlewood conjecture mentioned above. Also, the new form (2.33) of the Brun-Titchmarsh theorem draws special attention. This is because of the following fact: If there
exist two absolute constants $\alpha, \beta > 0$ with which we have, uniformly for (k, l) = 1,

(2.35)   $\pi(x; k, l) \le 2(1 - \alpha) \frac{x}{\varphi(k) \log(x/k)}$,    $k \le x^{\beta}$,

then Dirichlet L-functions $L(s, \chi)$, $\chi \pmod k$, should not have any exceptional zero; so
the theory of the distribution of primes in arithmetic progressions would fundamentally be
improved 27). Also, an effective lower bound, which is essentially best possible, would follow
for the class numbers of imaginary quadratic number fields. In this context, it should be
noted specifically that the critical bound

(2.36)   $\pi(x; k, l) < 2 \frac{x}{\varphi(k) \log(x/k)}$

has been established via a more careful application of Linnik's Sieve 28).
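The bound (2.36) can be compared with actual counts for small parameters. The sketch below computes, for a few moduli k and $x = 10^4$, the largest value of $\pi(x; k, l)$ over reduced classes l divided by the right side of (2.36); the function names and the sample moduli are illustrative assumptions.

```python
from math import gcd, log


def primes_up_to(x):
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    p = 2
    while p * p <= x:
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, x + 1, p)))
        p += 1
    return [n for n in range(x + 1) if sieve[n]]


def worst_ratio(x, k, primes):
    """Largest pi(x; k, l) over (k, l) = 1, divided by the right side of (2.36)."""
    phi = sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)
    bound = 2 * x / (phi * log(x / k))
    counts = {}
    for p in primes:
        counts[p % k] = counts.get(p % k, 0) + 1
    return max(counts.get(l, 0) for l in range(k) if gcd(l, k) == 1) / bound


x = 10**4
primes = primes_up_to(x)
ratios = {k: worst_ratio(x, k, primes) for k in (3, 4, 5, 7, 12, 100)}
print(ratios)
```

Every ratio stays below 1, as (2.36) requires; for these small moduli one expects values in the vicinity of 1/2, in line with the prime number theorem for arithmetic progressions.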

2.8 The above discussion might be termed as an account of the additive Large Sieve, for it 
concerns additive characters as is indicated by (2.30)-(2.31). We have seen the appearance 
of important upper bounds in the prime number theory. In the present section we turn to 
an account of the multiplicative Large Sieve, concerning instead Dirichlet characters; and we 
shall see that there emerges a surprising assertion on the asymptotic theory of the distribution 
of primes in arithmetic progressions. More precisely, the multiplicative Large Sieve opens a 
way to avoid the Extended Riemann Hypothesis 29). This fascinating theory was initiated
by A. Renyi [65] 30). In order to appreciate his contribution, we need to review briefly the
history of the theory of the distribution of primes. 

Primes in Short Intervals: Under the Riemann Hypothesis, the asymptotic formula (1.9)
holds with, e.g., $x^{1/2} (\log x)^3 \le y \le x/2$. However, G. Hoheisel [24] established, without any
hypothesis, the asymptotic formula upon the condition $x^{\vartheta} \le y \le x/2$, where $\vartheta$ is a positive
absolute constant less than 1. That was the unprecedented event in which was discovered the
possibility to avoid the Riemann Hypothesis; and it was the beginning of the modern theory
of the distribution of primes. At the core of Hoheisel's argument is a statistical study of the
distribution of the complex zeros of the Riemann zeta-function $\zeta(s)$ or a statistical proof of the
Riemann hypothesis 31), due to H. Bohr and E. Landau [3]. In Riemann's explicit formula for
the function $\pi(x)$ there is a sum over the complex zeros, into which Hoheisel introduced the
statistical study, along with a certain zero-free region on the left of the vertical line Re s = 1.
What should not be missed in Bohr-Landau's theory, especially in the context
of our present discussion, is the role of a version of the mean values of $\zeta(s)$. Taking later
developments into account, this concerns the analysis of



(2.37)   $\int_{-T}^{T} \left|\zeta\left(\tfrac{1}{2} + it\right)\right|^2 \left| \sum_{n < N} a_n n^{it} \right|^2 dt$,

with $N, T \ge 1$ and complex $a_n$ which are to be chosen appropriately 32).

Least Prime Number Theorem: If one wants to establish an analogue of Hoheisel's assertion for
primes in arithmetic progressions, then the study of $\pi(x; k, l)$ ((k, l) = 1) should be developed
on the supposition $x^{\vartheta} \le x/k$ with a new positive absolute constant $\vartheta$ less than 1. This is to
look for a way to avoid the Extended Riemann Hypothesis. Obviously an extension of (2.37) to
Dirichlet L-functions $L(s, \chi)$ ($\chi \pmod k$) is required; but this part of the theory does not cause
any essential difficulty, for it suffices to exploit the orthogonality of the characters. However,
the theory of the distribution of zeros of Dirichlet L-functions lacks what corresponds to
the zero-free region of $\zeta(s)$ mentioned above. One has to find a way to negate this defect,
which certainly requires developing the statistical study of the zeros in a far more refined fashion
than the followers of Bohr and Landau did. It was Linnik [42, I] who overcame this genuine
difficulty 33). There a definitive role was played by the Brun-Titchmarsh theorem (1.30) 34).
Further, the possibility of exceptional zeros caused another difficulty, or a quantitative study
of the Deuring-Heilbronn theory had to be developed. That was achieved by Linnik in [42,
II] 35). In this way the Least Prime Number Theorem was established; that is, there exists
an absolute constant c > 0 such that the least prime in every reduced residue class mod k is
less than $k^c$.

Mean Prime Number Theorem: Thus Linnik found a way to avoid the Extended Riemann 
Hypothesis. It concerned, however, a single modulus, though the uniformity on it was of 
course maintained. Thus the next target was to find a way to avoid the Extended Riemann 
Hypothesis simultaneously for all moduli in an arbitrary finite range. This time, a genuine 
difficulty took place in extending (2.37), which is the principal difference from Linnik's situation. It was Renyi [65] who resolved this difficulty. He started with Linnik's fundamental work
[41], and developed a version of the multiplicative Large Sieve to extend (2.37) to a double 
sum over moduli and characters, analogously involving Dirichlet L-functions and polynomials 
twisted by characters. 
Then, he could establish (1.26) for $Q = x^a$ with an absolute constant a > 0, without
any hypothesis; this is Renyi's Mean Prime Number Theorem 36). Hence, as can be seen from
(1.25)-(1.27), Renyi superseded Brun and made a great step toward the Twin Prime and the
Goldbach Conjectures.

Naturally, efforts afterward were concentrated on the improvement 37) of Renyi's theorem;
that is, to find larger Q. Finally, after 17 years of a series of struggles, it was established that

(2.38)   The inequality (1.26) holds with $Q = \sqrt{x}/(\log x)^B$,

where B is a function of A. This is called E. Bombieri-A.I. Vinogradov's Mean Prime Number
Theorem 38). Bombieri's argument [4] stands on the tradition of the Large Sieve; and Vinogradov [77] relied on the Dispersion Method [43], another fundamental invention of Linnik 39).
Despite the difference in their methods, what they achieved is essentially equivalent to each
other and to the consequence of the Extended Riemann Hypothesis, especially in the context
of its applications to sieve problems as exhibited above. In Bombieri's argument 40), (2.38)
could be said to be a consequence of the inequality

(2.39)   $\sum_{q \le Q} \frac{q}{\varphi(q)} \sum^{*}_{\chi \pmod q} \left| \sum_{M \le n < M + N} a_n \chi(n) \right|^2 \le (N - 1 + Q^2) \sum_{M \le n < M + N} |a_n|^2$,

where the asterisk means that the sum is restricted to primitive characters. Connecting multiplicative characters with additive characters via Gaussian sums, (2.39) follows immediately
from (2.30).

2.9 In this section we shall look into the relation between the Large Sieve and Selberg's Sieve,
in a perspective different from the above; we shall show that the multiplicative Large Sieve
can be amalgamated with Selberg's Sieve. As the discussion of the previous section suggests,
such an extension of the multiplicative Large Sieve has consequences in the theory of the
distribution of primes in arithmetic progressions 41). This aspect should not be unexpected,
especially if it is taken into account that an origin of Selberg's Sieve can be traced back to
(2.37). In fact, an initial version of Selberg's procedure developed in (2.13)-(2.19) could be
found in his argument to compute the minimum value of the expression (2.37) under the side
condition $a_1 = 1$ 42); that is, the extremal values of $a_n$ are found in much the same way as
those of $\lambda(d)$.

Returning to (2.19), the optimal $\lambda$ is written as



(2.40) 



E A ( rf ) 



n 

p\q 



-1 



H(p,Q) 



We compare this with (2.30), and ponder upon the norm of the linear operator $(\psi_q(n,\Omega))$,
with



(2.41)  $$\psi_q(n,\Omega) = \mu(q)\sqrt{H(q,\Omega)}\,\Psi_q(n,\Omega).$$
In this way, we find that for any complex vectors $\{a_n\}$, $\{b_q\}$ 43)

(2.42)  $$\sum_{q<z} \Bigl| \sum_{M<n\le M+N} a_n \psi_q(n,\Omega) \Bigr|^2 \le (N-1+z^2) \sum_{M<n\le M+N} |a_n|^2,$$

(2.43)  $$\sum_{M<n\le M+N} \Bigl| \sum_{q<z} b_q \psi_q(n,\Omega) \Bigr|^2 \le (N-1+z^2) \sum_{q<z} |b_q|^2.$$

Setting $a_n = w(n)$ in (2.42) we get (2.10) again; and (2.43) implies (2.10) as well, for it
contains (2.28). More generally, the operator $(\chi(n)\,(k/\varphi(k))^{1/2}\,\psi_q(n,\Omega))$ could be viewed in
the same way, where $\chi$ is primitive mod $k$, and $(k,q) = 1$, $kq < z$. That is, Selberg's Sieve
and the multiplicative Large Sieve could be hybridized, which yields interesting refinements
of (2.10).

This does not exhaust the flexibility hidden in Selberg's Sieve. One aspect in which
Selberg's Sieve supersedes Linnik's is that the class of sequences to which the former is
applicable is definitely wider than that of the latter. For instance, with a given arithmetic
function $f$ one may consider the quadratic form 44)



(2.44) 



n=l 



\ 



. d\n 
\d<z 



J 



under the side condition $\lambda(1) = 1$. The optimal $\lambda$ thus obtained yields an analogue of the
above $\psi_q(n,\Omega)$. It can be used to extend the multiplicative Large Sieve, which in turn
has an important application; that is, a highly simplified proof of the Least Prime Number
Theorem. In the previous section we stressed the role played by the Brun-Titchmarsh theorem
in Linnik's proof of his Least Prime Number Theorem. There the Sieve Method was somewhat
hidden. In the new proof, the Sieve Method emerges as the protagonist, and leads the whole
story 45).



Chapter 3. Rosser's Sieve 

3.1 In the present chapter we shall return to the circle of Brun's ideas. Being combinatorial
in its nature, Brun's Sieve demands efforts to comprehend. On the other hand, Selberg's
Sieve is simple and powerful; also Linnik's Sieve gave rise to the principle of the Large Sieve,
which had a tremendous impact on the development of the theory of the distribution
of primes. Perhaps because of this, it took a considerably long time for Brun's theory to be
appreciated and shared by many. In fact, it was 30 years after his work [12] that Rosser
(ca. 1950) opened a way leading to the complete settlement of the linear sieve 46). Namely, he
discovered a choice of sieve weights on the general condition introduced in Section 1.7 (with
$\kappa = 1$), which gives best possible main terms in both the upper and lower bounds. Moreover,
the construction of his sieve weights is relatively simple. In what follows we shall describe
the salient points of Rosser's Sieve, especially his Linear Sieve. We shall employ symbols
and definitions introduced in Chapter 1, without mention. We stress that we shall start with
$S(\mathcal{A}, z_0)$ ($2 \le z_0 < z$) instead of $\mathcal{A}$. The reason why we first sift $\mathcal{A}$ with primes less than a
certain $z_0$ will become apparent in the course of discussion.

3.2 To get a lower bound for the size of a subset of a finite set, it suffices to have an upper
bound for its complementary subset. To wit, lower bounds could result from upper bounds.
This trivial principle was first exploited effectively by A. A. Buchstab [13], in the context of
the Sieve Method. More explicitly, his idea relies on the following identity: Classifying the
elements of $S(\mathcal{A}, z_0) \setminus S(\mathcal{A}, z)$ according to their least prime factors, we get

(3.1)  $$|S(\mathcal{A},z)| = |S(\mathcal{A},z_0)| - \sum_{z_0 \le p < z} |S(\mathcal{A}_p, p)|.$$

This logical identity is named after Buchstab. If we put $z_0 = 2$ and iterate the identity
infinitely, we get (1.2). It is, however, useless in general, and thus Brun introduced a system
of restricting the participating divisors of $P(z)$.
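The Buchstab identity (3.1) is purely combinatorial, so it can be verified directly on any finite sequence. The sketch below does this by brute force; the test sequence $n^2 + 1$ and the helper names `sifted` and `buchstab_rhs` are illustrative assumptions, and $S(\mathcal{A}, z)$ is taken, as in Chapter 1, to be the set of elements of $\mathcal{A}$ with no prime factor less than $z$.

```python
def primes_below(z):
    """All primes p < z, by trial division (adequate for small z)."""
    return [p for p in range(2, z)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def sifted(A, z):
    """S(A, z): the elements of A having no prime factor less than z."""
    ps = primes_below(z)
    return [a for a in A if all(a % p for p in ps)]


def buchstab_rhs(A, z0, z):
    """Right-hand side of (3.1): |S(A, z0)| - sum over z0 <= p < z of |S(A_p, p)|."""
    total = len(sifted(A, z0))
    for p in primes_below(z):
        if p >= z0:
            A_p = [a for a in A if a % p == 0]   # A_p: multiples of p in A
            total -= len(sifted(A_p, p))         # those with least prime factor p
    return total


A = [n * n + 1 for n in range(1, 400)]           # an arbitrary test sequence
z = 30
for z0 in (2, 3, 7, 10):
    print(z0, len(sifted(A, z)), buchstab_rhs(A, z0, z))
```

Each printed line should show the same value twice, whatever the starting point $z_0 \le z$.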

Any restriction of the divisors is the same as attaching the weight 0 or 1 to each divisor.
With this observation in mind, we reconsider (1.2). Thus, let $\eta$ be an arbitrary function with
$\eta(1) = 1$, and rewrite (3.1) as

(3.2)  $$|S(\mathcal{A},z)| = |S(\mathcal{A},z_0)| - \sum_{z_0 \le p < z} \eta(p)|S(\mathcal{A}_p,p)| - \sum_{z_0 \le p < z} (1-\eta(p))|S(\mathcal{A}_p,p)|.$$

This is the case $\ell = 1$ of the identity

(3.3)  $$\begin{aligned} |S(\mathcal{A},z)| = {} & \sum_{\substack{d \mid P(z_0,z) \\ \nu(d) < \ell}} \mu(d)\rho(d)|S(\mathcal{A}_d,z_0)| + (-1)^{\ell} \sum_{\substack{d \mid P(z_0,z) \\ \nu(d) = \ell}} \rho(d)|S(\mathcal{A}_d,p(d))| \\ & + \sum_{\substack{d \mid P(z_0,z) \\ \nu(d) \le \ell}} \mu(d)\sigma(d)|S(\mathcal{A}_d,p(d))|. \end{aligned}$$

Here $P(z_0, z) = P(z)/P(z_0)$, $\rho(1) = 1$, $\sigma(1) = 0$, and for $d = p_1p_2\cdots p_l$ ($p_1 > p_2 > \cdots > p_l$)

(3.4)  $$\rho(d) = \eta(p_1)\eta(p_1p_2)\cdots\eta(p_1p_2\cdots p_l), \quad \sigma(d) = \rho(d/p(d)) - \rho(d) \quad (p(d) = p_l).$$

To prove (3.3), we apply to (3.2) the replacements $\mathcal{A} \mapsto \mathcal{A}_d$, $z \mapsto p(d)$, $\eta(p) \mapsto \eta(dp)$, and
insert the result into (3.3); then we get $\ell \mapsto \ell+1$. Hence, setting $\ell > \pi(z)$ in (3.3), we obtain

(3.5)  $$|S(\mathcal{A},z)| = \sum_{d \mid P(z_0,z)} \mu(d)\rho(d)|S(\mathcal{A}_d,z_0)| + \sum_{d \mid P(z_0,z)} \mu(d)\sigma(d)|S(\mathcal{A}_d,p(d))|,$$

which is an extension or rather a refinement of (1.2).

3.3 For the sake of simplicity, we impose the restriction $0 \le \eta(d) \le 1$ for any $d \mid P(z)$; thus,
$0 \le \rho(d) \le 1$, $0 \le \sigma(d) \le 1$. With this, we shall try to derive from (3.5) upper and lower
bounds of $|S(\mathcal{A}, z)|$ that are as sharp as possible. First, we observe trivially

(3.6)  $$(-1)^r \Bigl\{ |S(\mathcal{A},z)| - \sum_{d \mid P(z_0,z)} \mu(d)\rho(d)|S(\mathcal{A}_d,z_0)| \Bigr\} \le \sum_{\substack{d \mid P(z_0,z) \\ \nu(d) \equiv r \ (\mathrm{mod}\ 2)}} \sigma(d)|S(\mathcal{A}_d,p(d))|.$$

There is a way to have the equality here; that is, $\sigma(d) = 0$ ($\nu(d) \equiv r + 1 \pmod 2$). Thus we
set

(3.7)  $$\eta(d) = 1 \quad (\nu(d) \equiv r + 1 \pmod 2);$$

and such an $\eta$ is denoted as $\eta_r$, and correspondingly we define $\rho_r$ and $\sigma_r$. Then, we have

(3.8)  $$|S(\mathcal{A},z)| = \sum_{d \mid P(z_0,z)} \mu(d)\rho_r(d)|S(\mathcal{A}_d,z_0)| + (-1)^r \sum_{d \mid P(z_0,z)} \sigma_r(d)|S(\mathcal{A}_d,p(d))|.$$

Discarding the second sum, we get

(3.9)  $$(-1)^r \Bigl\{ |S(\mathcal{A},z)| - \sum_{d \mid P(z_0,z)} \mu(d)\rho_r(d)|S(\mathcal{A}_d,z_0)| \Bigr\} \ge 0.$$

Also we have, corresponding to (3.8),

(3.10)  $$V(z,\omega) = V(z_0,\omega)V_0(z,\omega;\rho_r) + (-1)^r \sum_{d \mid P(z_0,z)} \sigma_r(d)\frac{\omega(d)}{d}V(p(d),\omega),$$

where

(3.11)  $$V_0(z,\omega;\rho_r) = \sum_{d \mid P(z_0,z)} \mu(d)\rho_r(d)\frac{\omega(d)}{d}.$$

In fact, expanding out the product $V(z,\omega)/V(z_0,\omega)$ and classifying the resulting terms in
Buchstab's fashion, we obtain an identity analogous to (3.1). The rest of the discussion is the
same as above. In passing, we note that $V_0(z, \omega; \rho_r) = V(z, \omega; \rho_r)$, provided $z_0 = 2$.

3.4 In deriving (3.9) from (3.8) we brought in a certain inaccuracy, which should certainly be
evaded as much as possible 47). To this end, we note the trivial but crucial fact that $|S(\mathcal{A}, z)|$
is a non-increasing function of $z$. Thus neglecting $S(\mathcal{A}_d, p(d))$ with $p(d)$ that is small
for $\mathcal{A}_d$ most likely causes a relatively large loss. To avoid this we had better set $\sigma_r(d) = 0$
for such $d$. One of the most fruitful devices to make explicit the smallness of $p(d)$ for $\mathcal{A}_d$
is to introduce two parameters $\beta > 1$ and $D > 0$, and to define $p(d)$ to be small for $\mathcal{A}_d$
if $p(d) < (D/d)^{1/\beta}$. Behind this criterion is the concept of the Sieving Limit, but at this
moment there is no particular necessity to know the details 48).
Hence, in addition to (3.7), we impose

(3.12)  $$\eta(d) = \begin{cases} 1 & \text{if } p(d)^{\beta} d < D, \\ 0 & \text{if } p(d)^{\beta} d \ge D, \end{cases} \qquad (\nu(d) \equiv r \pmod 2).$$

Then, $\rho_r$ and $\sigma_r$ are, respectively, the characteristic functions of the sets 49)

(3.13)  $$\mathcal{D}(\rho_r) = \{1\} \cup \bigl\{ d : p_1p_2\cdots p_{2k+r-1}\,p_{2k+r}^{\beta+1} < D,\ 1 \le 2k + r \le l \bigr\},$$

(3.14)  $$\mathcal{D}(\sigma_r) = \bigl\{ d : \rho_r(d/p(d)) = 1,\ p_1p_2\cdots p_{l-1}\,p_l^{\beta+1} \ge D,\ l \equiv r \pmod 2 \bigr\},$$

with $r = 0, 1$, and $d$ being the same as in (3.4).

With the sieve weight $\rho_r(d)$ thus constructed, the formula (1.18) is called Rosser's Sieve.
In fact, the validity of (1.18) is immediate in view of (3.9) with $z_0 = 2$. Moreover, because
of the supposition $\beta > 1$ the level condition (1.21) is fulfilled with the present $D$. It should
be noted that as $D$ and $\beta$ are taken larger and smaller, respectively, the set $\mathcal{D}(\sigma_r)$ becomes
narrower; that is, the loss caused at the step (3.9) should decrease.
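The construction of Sections 3.2-3.4 can be tried out on a small example. The following is a minimal sketch, assuming $z_0 = 2$ (so that $S(\mathcal{A}_d, z_0) = \mathcal{A}_d$) and the simplest sequence $\mathcal{A} = \{n \le x\}$, for which $|\mathcal{A}_d| = \lfloor x/d \rfloor$. It walks through the divisors with $\rho_r(d) = 1$ according to (3.7) and (3.12), collects the discarded terms with $\sigma_r(d) = 1$, and compares both sides of (3.8). The names `rosser_sums` and `rough_count` are hypothetical helpers introduced only for this illustration.

```python
def primes_below(z):
    """All primes p < z, by trial division (adequate for small z)."""
    return [p for p in range(2, z)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def rough_count(y, p, primes):
    """Number of integers 1 <= m <= y with no prime factor less than p."""
    small = [q for q in primes if q < p]
    return sum(1 for m in range(1, y + 1) if all(m % q for q in small))


def rosser_sums(x, z, D, beta, r):
    """For A = {n <= x} and z0 = 2, return the two sums appearing in (3.8):

    main      = sum of mu(d) rho_r(d) |A_d|        over d | P(z),
    discarded = sum of sigma_r(d) |S(A_d, p(d))|   over d | P(z).
    """
    primes = primes_below(z)
    main = 0
    discarded = 0

    def extend(d, nu, bound):
        # d has nu prime factors, all >= bound; rho_r(d) = 1 on entry
        nonlocal main, discarded
        main += (-1) ** nu * (x // d)
        for p in primes:                      # candidates for the next, smaller prime
            if p >= bound:
                break
            if (nu + 1) % 2 == r % 2 and d * p ** (beta + 1) >= D:
                # eta_r(dp) = 0 by (3.12): sigma_r(dp) = 1, the branch is cut off
                discarded += rough_count(x // (d * p), p, primes)
            else:
                extend(d * p, nu + 1, p)      # eta_r(dp) = 1: keep descending

    extend(1, 0, z)
    return main, discarded


x, z, D, beta = 2000, 20, 200, 2
exact = rough_count(x, z, primes_below(z))    # |S(A, z)| by direct count
for r in (0, 1):
    main, discarded = rosser_sums(x, z, D, beta, r)
    print(r, exact, main, main + (-1) ** r * discarded)
```

For $r = 0$ the value `main` should lie below the exact count and for $r = 1$ above it, in accordance with (3.9), while the last column reproduces the exact count as (3.8) predicts.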

3.5 As an application of Rosser's Sieve, we shall prove Brun's theorem (1.22) briefly 50). In
this section, we work with $z_0 = 2$. We note first that if $z^2 < D$, then we have

(3.15)  $$\frac{1}{2}\Bigl(\frac{\beta-1}{\beta+1}\Bigr)^{\nu(d)/2}\log D \le \log(D/d) \quad (d \in \mathcal{D}(\rho_r)).$$

In fact, if $r = 1$, $\nu(d) = 2\ell$, $\rho_1(d) = 1$, then we have $p_{2j+2} < p_{2j+1} < (D/(p_1p_2\cdots p_{2j}))^{1/(\beta+1)}$
($0 \le j \le \ell-1$). Thus, $((\beta-1)/(\beta+1))\log(D/(p_1p_2\cdots p_{2j})) < \log(D/(p_1p_2\cdots p_{2j+2}))$, which
gives (3.15). Other cases are analogous. Next, in the second sum of (3.10), the terms are
classified according to the values of $\nu(d)$, and taking (1.17) into account we see that the sum
is



(3.16) 




where $q = \min_{\nu(d)=\ell} p(d)$, $\ell = \min \nu(d)$ with $d \mid P(z)$ and $\sigma_r(d) = 1$. By the definition (3.4),
$\rho_r(d) = 0$, and thus $p(d)^{\beta} d \ge D$, which gives $\ell > \tau - \beta$ because $z^{\tau} = D$. On the other hand,
we have $\rho_r(d/p(d)) = 1$, and by (3.15)

(3.17)  $$\frac{1}{2}\Bigl(\frac{\beta-1}{\beta+1}\Bigr)^{(\nu(d)-1)/2}\log D \le \log(D/d) + \log p(d) \le (\beta+1)\log p(d),$$

which gives a lower bound for $q$. Inserting these assertions on $\ell$ and $q$ into (3.16), and setting
$\beta = \tau/3$, we reach (1.22) after some elementary estimation.

3.6 As a matter of fact, it is known that if $\kappa > 1$, then the upper bound via Rosser's Sieve
is inferior to that via Selberg's Sieve 51). Nevertheless, if $\kappa = 1$, then Rosser's Sieve yields
optimal upper and lower bounds as has been stressed above. Since the linear sieve problems
include great conjectures, Rosser's construction of his Linear Sieve is extremely important 52).

We first show his assertion: With $\beta = 2$, we fix Rosser's sieve weights $\rho_r(d)$; and let
the functions $\phi_r(\tau)$, with $\phi_{r_1} = \phi_{r_2}$ ($r_1 \equiv r_2 \pmod 2$), satisfy the difference-differential
equation 53)

(3.18)  $$(\tau\phi_r(\tau))' = \phi_{r+1}(\tau-1) \quad (\tau > 2),$$

(3.19)  $$\tau\phi_1(\tau) = 2e^{c_E}, \quad \phi_0(\tau) = 0 \quad (0 < \tau \le 2).$$

Then we have, for $z = D^{1/\tau}$, $z_0 = \exp((\log D)/(\log\log D)^2)$,

(3.20)  $$V(z_0,\omega)V_0(z,\omega;\rho_r) = (1+o(1))\phi_r(\tau)V(z,\omega) \quad (\kappa = 1).$$

With this, we apply Brun's Sieve (1.22) to the term $|S(\mathcal{A}_d, z_0)|$ appearing in (3.9), and
find that

(3.21)  $$(-1)^r\bigl\{|S(\mathcal{A},z)| - (1+o(1))\phi_r(\tau)V(z,\omega)X\bigr\} \ge (-1)^r \sum_{\substack{d<D \\ d \mid P(z)}} \mu(d)\delta_r(d)R_d,$$

in which $D = z^{\tau}$, and $\delta_r$ is a certain characteristic function 54). This is called Rosser's Linear
Sieve.
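The system (3.18)-(3.19) determines $\phi_0$ and $\phi_1$ step by step, since the right side of (3.18) only involves values at $\tau - 1$. A minimal numerical sketch follows, assuming that $c_E$ is Euler's constant (as Note 6 suggests) and using the trapezoidal rule on a uniform grid; the function name `linear_sieve_functions` and the grid size are illustrative choices. For $2 \le \tau \le 3$ and $2 \le \tau \le 4$ respectively, integrating (3.18) by hand gives $\tau\phi_1(\tau) = 2e^{c_E}$ and $\tau\phi_0(\tau) = 2e^{c_E}\log(\tau-1)$, which serves as a check.

```python
from math import exp, log

EULER_GAMMA = 0.5772156649015329   # assumed value of the constant c_E


def linear_sieve_functions(tau_max, steps_per_unit=1000):
    """Tabulate phi_0, phi_1 of (3.18)-(3.19) on [0, tau_max]; return phi(r, tau)."""
    n = steps_per_unit
    h = 1.0 / n
    total = int(round(tau_max * n))
    c = 2.0 * exp(EULER_GAMMA)
    # P[i] = tau * phi_1(tau) and Q[i] = tau * phi_0(tau) at tau = i * h;
    # the initial values (3.19) fill the range 0 < tau <= 2
    P = [c] * (total + 1)
    Q = [0.0] * (total + 1)
    for i in range(2 * n + 1, total + 1):
        lo, hi = i - 1 - n, i - n            # grid indices of tau - 1 at both ends
        phi0_lo, phi0_hi = Q[lo] / (lo * h), Q[hi] / (hi * h)
        phi1_lo, phi1_hi = P[lo] / (lo * h), P[hi] / (hi * h)
        P[i] = P[i - 1] + 0.5 * h * (phi0_lo + phi0_hi)   # (tau phi_1)' = phi_0(tau - 1)
        Q[i] = Q[i - 1] + 0.5 * h * (phi1_lo + phi1_hi)   # (tau phi_0)' = phi_1(tau - 1)

    def phi(r, tau):
        i = int(round(tau * n))
        return (P[i] if r % 2 == 1 else Q[i]) / tau

    return phi


phi = linear_sieve_functions(10)
for tau in (2, 3, 4, 6, 10):
    print(tau, phi(0, tau), phi(1, tau))
```

The printed values illustrate how quickly both functions approach 1, with $\phi_0$ below and $\phi_1$ above.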

The main steps of the proof of (3.20) are as follows: With $\beta > 1$, which is to be fixed
later, we construct Rosser's sieve weights $\rho_r(d)$, and put

(3.22)  $$V(z,\omega)K(z,\rho_r) = \max\{0, V(z_0,\omega)V_0(z,\omega;\rho_r)\}.$$

Then we assume that there exist continuous functions $k_r$ such that $K(z, \rho_r) = (1+o(1))k_r(\tau)$;
we have $0 \le k_0(\tau) \le 1 \le k_1(\tau)$ by (3.10). Also, we assume 55) that $\beta = \inf\{\tau : k_0(\tau) > 0\}$.
On noting the definitions (3.11) and (3.13), we have

(3.23) Vb(z,u;;pi) = Vo(z ,u;p{) - — ^-Vb(p, u; pi). 

z () <p<z ^ 

Here the level of the Rosser sieve weight $\rho_0$ is equal to $D/p$; that is, $p_1 = p$ in (3.13). If
$\rho_1(p) = 1$, then $p < D^{1/(\beta+1)}$. Thus, if $z \ge D^{1/(\beta+1)}$, then $V_0(z,\omega;\rho_1) = V_0(D^{1/(\beta+1)},\omega;\rho_1)$.
In view of (1.17) ($\kappa = 1$), we have $\tau k_1(\tau) = (\beta+1)k_1(\beta+1)$ ($\tau \le \beta+1$). If $\tau > \beta+1$,
then $\log(D/p)/\log p > \beta$; thus, by our assumption on $\beta$, we may write $V(z_0,\omega)V_0(p,\omega;\rho_0) =
V(p,\omega)K(p,\rho_0)$. Hence, we find that

(3.24)  $$V(z,\omega)k_1(\tau) = V(z_0,\omega)k_1(\tau_0) - (1+o(1))\sum_{z_0 \le p < z}\frac{\omega(p)}{p}V(p,\omega)k_0(\tau_p),$$

with $z_0^{\tau_0} = D$ and $\tau_p = (\log D)/(\log p) - 1$. We apply (1.17) to (3.24), and express the result
in terms of a Stieltjes integral. We are led to the integral equation

(3.25)  $$\tau_0k_1(\tau_0) - \tau k_1(\tau) = \int_{\tau}^{\tau_0} k_0(t-1)\,dt.$$

This ends the discussion on the case $r = 1$. The other case could be treated analogously. The
equation that $k_0$ should satisfy is (3.18) if $\tau > \beta$; and if $\tau \le \beta$, then it is (3.19) but with the
constant $2e^{c_E}$ being replaced by $(\beta+1)k_1(\beta+1)$. From this, it follows readily that

(3.26) k r (r) = 1 + (-If ^^09 + (/? + 1) (l + O (^)) + O (j^y) • 

In view of Brun's theorem (1.22), we find the optimal value of $\beta$; that is, we should set $\beta = 2$.
Then, $3k_1(3) = 2e^{c_E}$ follows. In this way we reach (3.18)-(3.19).

Having this, we set $k_r = \phi_r$, and go from (3.25) back to (3.24); then we see that (3.24)
holds in fact with $k_1 = \phi_r$ and $k_0 = \phi_{r+1}$. By the definition (3.19) one may multiply each
summand on the right by the factor $\rho_r(p)$. The identity thus obtained can be iterated in
much the same way as in Section 3.2. We get

(3.27)  $$V(z,\omega)\phi_r(\tau) = (1+o(1))V(z_0,\omega)\sum_{d \mid P(z_0,z)} \mu(d)\rho_r(d)\frac{\omega(d)}{d}\,\phi_{r+\nu(d)}\Bigl(\frac{\log(D/d)}{\log z_0}\Bigr).$$

Comparing this with the definition (3.11) and on noting $\phi_r(\tau) = 1 + O(1/\Gamma(\tau))$ ((3.26),
$\beta = 2$), our problem is reduced to the estimation of the expression

(3.28)  $$\sum_{d \mid P(z_0,z)} \rho_r(d)\frac{\omega(d)}{d}\exp\Bigl(-\frac{\log(D/d)}{\log z_0}\Bigr).$$

We skip the details, but this can be seen to be negligible, which ends the proof of (3.20).
3.7 The extremal situation 56) that implies that Rosser's Linear Sieve is optimal is given by

(3.29)  $$\mathcal{B}^{(r)} = \{n \le x : [\text{the total number of prime divisors of } n] \equiv r \bmod 2\}.$$

We have $X = x/2$ and $\omega \equiv 1$. Rosser's Sieve ($\beta = 2$, $D = x$) gives

(3.30)  $$|S(\mathcal{B}^{(r)},z)| = \sum_{d \mid P(z)} \mu(d)\rho_r(d)|\mathcal{B}^{(r)}_d|.$$

Hence, no loss is caused at (3.9) with the present specialization. The argument of the previous
section could be repeated, and we get

(3.31)  $$|S(\mathcal{B}^{(r)},z)| = (1+o(1))\frac{x}{2}\,\phi_r\Bigl(\frac{\log x}{\log z}\Bigr)V(z,1).$$

Namely, the upper and lower bounds implied by Rosser's Linear Sieve are in fact attainable.

3.8 As an application of the above, we exhibit J.-R. Chen's theorem [15] 57): For any
sufficiently large even integer $N$, we have 58)

(3,2) ,0, : N- P+ P 2 ,| > (l - j^) n (£) 

p>2 

with an absolute constant $C_0$. Here $P_2$ denotes an integer which has at most two prime
factors. Without doubt, this famous assertion is at the pinnacle of the entire modern theory of
the Sieve Method.

Chen's plan of the proof is relatively simple. We first pick up any integer $n < N$ such
that $(n, P(N^{1/10})) = 1$, and consider the value of the expression 59)

(3.33)  $$W(n) = 1 - \frac{1}{2}\sum_{\substack{p_1 \mid n \\ N^{1/10} \le p_1 < N^{1/3}}} 1 - \frac{1}{2}\sum_{\substack{p_1 \mid n \\ N^{1/10} \le p_1 < N^{1/3}}}\ \sum_{\substack{n = p_1p_2p_3 \\ N^{1/3} \le p_2 < (N/p_1)^{1/2}}} 1.$$

We find readily that if $W(n) > 0$ or $W(n) \ge \tfrac12$, then $n$ is a $P_2$. Thus, with $\mathcal{A} = \{N - p : p < N\}$,
we have that

(3.34)  $$\begin{aligned} |\{p : N = p + P_2\}| \ge {} & |S(\mathcal{A}, N^{1/10})| - \frac{1}{2}\sum_{N^{1/10} \le p_1 < N^{1/3}} |S(\mathcal{A}_{p_1}, N^{1/10})| \\ & - \frac{1}{2}\bigl|\{p < N : N = p + p_1p_2p_3,\ N^{1/10} \le p_1 < N^{1/3} \le p_2 < (N/p_1)^{1/2}\}\bigr|. \end{aligned}$$

We apply (3.21) ($r = 0$) to the first term on the right, and (3.21) ($r = 1$) to the second. On the
other hand, we replace the third term by $\tfrac12|S(\mathcal{A}^*, N^{1/2}(\log N)^{-A})|$ with $A$ being sufficiently
large, and apply (3.21) ($r = 1$) again. Here $\mathcal{A}^* = \{N - p_1p_2p_3 : p_1p_2p_3 < N\}$ with $p_1, p_2$ as
above. To bound the remainder terms that arise from the first two applications of Rosser's
Linear Sieve, we employ Bombieri-Vinogradov's Mean Prime Number Theorem (2.38). To
deal with the remainder term caused by the third application, we employ an extension 60) of
(2.38) to the sequence $\{p_1p_2p_3 < N\}$ with $p_1, p_2$ as before. What remains is to compute the
main terms, which is, however, rudimentary.
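The combinatorial heart of Chen's plan, namely that a positive weight $W(n)$ forces $n$ to be a $P_2$, can be tested by brute force. The sketch below does so under one added assumption that is not stated in the text: $n$ is taken squarefree. Without it the implication can fail (for $N = 10^5$ the integer $n = 5^2 \cdot 149$ has $W(n) = 1/2$ but three prime factors); such $n$ are few and presumably absorbed into an error term in the full proof. The helper names are illustrative, and the weight is returned as $2W(n)$ to stay in integers.

```python
def prime_factors(n):
    """Prime factorisation of n >= 1 as a list with multiplicity (trial division)."""
    out, p = [], 2
    while p * p <= n:
        while n % p == 0:
            out.append(p)
            n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out


def twice_chen_weight(n, N):
    """Return 2 * W(n), with W(n) as in (3.33); all comparisons use exact integers."""
    twice_w = 2
    for p1 in sorted(set(prime_factors(n))):
        if p1 ** 10 >= N and p1 ** 3 < N:            # N^(1/10) <= p1 < N^(1/3)
            twice_w -= 1                             # first sum of (3.33)
            rest = prime_factors(n // p1)
            if len(rest) == 2:                       # n = p1 * p2 * p3
                for p2 in set(rest):
                    # N^(1/3) <= p2 < (N / p1)^(1/2)
                    if p2 ** 3 >= N and p1 * p2 * p2 < N:
                        twice_w -= 1                 # second (double) sum of (3.33)
    return twice_w


N = 20000
checked = 0
for n in range(2, N):
    ps = prime_factors(n)
    squarefree = len(set(ps)) == len(ps)
    rough = min(ps) ** 10 >= N                       # (n, P(N^(1/10))) = 1
    if squarefree and rough and twice_chen_weight(n, N) > 0:
        assert len(ps) <= 2                          # n is a P_2
        checked += 1
print(checked, "admissible n with W(n) > 0, all of them P_2")
```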



Chapter 4. The Remainder Term 

4.1 From what we have described so far, one may infer the reach of the modern Sieve Method. 
To continue the story, we should now leave the discussion of the main terms for the estimation 
of the remainder terms 61). There must converge the true essences of Analytic Number Theory,
as the proof of Chen's theorem (3.32) illustrates dramatically. We have, however, too many 
relevant fields, subjects, and technicalities to mention. Thus we would rather single out the 
idea that is probably the most fundamental in the theory of the remainder terms, especially 
of the Linear Sieve. A culmination in this context is due to H. Iwaniec [34], and to describe 
it we need to tell a brief history. 

4.2 The development started with the discovery that Selberg's sieve weights could intervene 
in the control of the remainder term, in a highly non-trivial way; the serendipity occurred to 
us [49]. This is indeed surprising, because those weights had been constructed solely
with the aim to attain the best possible main term while the remainder term had been utterly 
disregarded. That is, in the structure of the sieve weights thus defined is hidden a mechanism 
that could induce massive cancellations among the summands in the second sum of (1.20), 
r = 1. A little later the same occurred to Chen [16] with Rosser's Linear Sieve. 

4.3 Let us dwell a little on their findings. They concern the level of sieve weights, and
we have to return to the concept itself. At the bottom is the all too natural prerequisite that
any main term be superior in magnitude to the corresponding remainder term. However, in
the Sieve Method this triviality had never been achieved in any effective manner until Brun
introduced the cutoff argument into Eratosthenes' Sieve, and brought about a revolutionary
change. From there stemmed the concept of the level of sieve weights, as introduced at (1.21),
though it left for long its trail only in the primitive

(4.1)  $$|R(\mathcal{A},z;\rho_r)| \le \sum_{\substack{d<D \\ d \mid P(z)}} |\rho_r(d)||R_d|.$$

Having the sieve weights that yield optimal main terms, the focus of attention is naturally
on the very basic issue: to get larger levels. That is essentially the unique way to make the
inequality (1.19) sharper as the reasoning at the end of Section 3.4 suggests, although it is
meant only for Rosser's Sieve. Namely, to go beyond (4.1) the inner structure of the sequence
$\{\mu(d)\rho_r(d)R_d\}$ has to be exploited so that the cancellation among the members be detected;
and the size of $D$, the level of $\rho_r$, could be taken larger than a priori.

This was done for the first time in [49]. Thus, let $\Omega(p) = \{0\}$, say, and write $\mu(d)\rho_1(d) =
\sum_{[d_1,d_2]=d} \lambda(d_1)\lambda(d_2)$ with $\lambda(d) = 0$ for $d \ge z$. We assume that $\lambda$ is chosen optimally so that
$|\lambda(d)| \le 1$ as (2.20) shows. Then Selberg's Sieve or (2.11) implies that (4.1) ($r = 1$) could be
replaced by the expression



(4.2)  $$|R(\mathcal{A},z;\rho_1)| \le \sup_{a,\,b}\Bigl|\sum_{m<z}\sum_{n<z} a_mb_nR_{[m,n]}\Bigr|,$$

where $a = \{a_m\}$, $b = \{b_n\}$ are arbitrary vectors such that $|a_m|, |b_n| \le 1$. This is yet trivial;
but its implication is striking. For instance, applying (4.2) to the sequence $\mathcal{A} = \{n \le x : n \equiv
l \pmod k\}$ ($(k,l) = 1$), we obtain an improvement upon (2.36):

(4.3)  $$\pi(x;k,l) \le 2(1+o(1))\frac{x}{\varphi(k)\log(x/\sqrt{k})} \quad (k \le x^{6/17}).$$

That is, (4.2) allows us to utilize the level $D = (x/\sqrt{k})(\log x)^{-2}$ in place of the trivial
$D = (x/k)(\log x)^{-2}$ which is involved in (2.33). The cause of this is to have had a bilinear
form in (4.2); that is, to have read the structure of those sieve weights as such.

On the other hand Chen [16] exploited the Buchstab identity, in the case of the Linear
Sieve. In (3.1) we put $L^J = z/z_0$ with an integer $J$, divide the sum over the primes according
to the covering $[z_0, z) = \bigcup_{j \le J} I$, $I = [z_0L^{j-1}, z_0L^{j})$, and apply Rosser's Sieve to each $S(\mathcal{A}_p, p)$,
$p \in I$, with the level $D/(z_0L^{j})$. Provided $J$ is chosen appropriately, the pair of the main terms
remains the same asymptotically, because of (3.18)-(3.19); but the remainder term is bounded
by



(4.4) 



sup sup 

K<z a, b 



^2 V(P n ) a P b nRpn 
K<p<KL n<D/K 



which is to be compared with (4.2). With this bilinear form, Chen could detect the
cancellation inside the remainder term. The effect is well exhibited in his own application that
yielded the assertion 62) $P_2 \in [x - \sqrt{x}, x)$ for any sufficiently large $x$. In view of the relation
between the Riemann Hypothesis and the existence of primes in short intervals, this is indeed
remarkable.

4.4 Now, after the two precursors 63), Iwaniec [34] made a true incision into the subject.
Superseding (4.2) and (4.4), he bounded the remainder term in (3.21) by the expression



(4.5) 



(log z) ■ sup 



a, b 



^2 ^2 K mn ) a mb n R r 



m<M n<N 
where $M$, $N$ are arbitrary except for $MN = D$. This is called Iwaniec's bilinear form for the
remainder term in the Linear Sieve. The basis of his idea is fairly simple; that simplicity is
shared by the above two ideas similarly. Thus, in the process to reach Rosser's sieve weights
(3.13), $\beta = 2$, those primes participating in the sieve are classified as Chen did. The function $\eta$
is first defined over the family of all set theoretic products of intervals $I$; and it is redefined
as a function over integers, in an obvious manner according to their prime decompositions.
With this, we proceed in much the same way as we did in Sections 3.2-3.4 and 3.6. Then a
smoothed version of (3.21) emerges. What remains is solely Iwaniec's penetrating observation
on the remainder term thus obtained 64).

That the parameters $M$, $N$ are independent is a real merit in Iwaniec's Linear Sieve,
because of which (4.5) has given rise to many remarkable consequences. One of the best
applications is done by Iwaniec and M. Jutila [36], a landmark among the works on the
existence of primes in short intervals 65).

Conclusion 

What (4.5) for instance suggests is the importance of the circle of methods, which are repre- 
sented by Linnik's Dispersion Method. The origin could be found in the Weyl-van der Corput 
method dealing with trigonometrical sums, which is a device closely related to subconvexity 
bounds of the Riemann zeta and analogous functions, though a far cry from the Lindelof 
Hypothesis. 

In those methods, especially in Linnik's, often Kloosterman sums play a fundamental 
role. This was the very reason why Linnik [44] envisaged the cancellation among the sums, 
which is perhaps hard to detect if one sticks to algebraic means only. Selberg [71] opened a 
way, and V.N. Kuznetsov [40] made a remarkable contribution to realize a part of Linnik's 
dream. Together with an independent research by R. W. Bruggeman [8] , the work [40] brought 
a new era in Analytic Number Theory. That was begun by Iwaniec [32], when he combined
their works with the additive Large Sieve and created the spectral Large Sieve. On that basis, 
Bombieri, J.B. Friedlander and Iwaniec [7] achieved a genuine improvement upon (2.38). 

On the other hand, the appearance of Kloosterman sums in the discussion of non-diagonal 
parts arising in the applications of the Dispersion Method or alike must be a reflection of the 
fact that we are actually working on a certain group structure 66); that is, we are looking at
the remainder term in sieves via automorphic and harmonic mechanisms on GL(2). Bearing 
Brun's torch we have come to a far country. 

Addendum (May 14, 2005) 

Very recently there was a fantastic development in the study of gaps between primes: In their
unpublished preprint "Small gaps between primes. II (preliminary)" (February 8, 2005), D.A.
Goldston, J. Pintz, and C.Y. Yildirim established, among other things,

(A.1)  $$\liminf_{n\to\infty}\frac{p_{n+1}-p_n}{\log p_n} = 0,$$

and that if (1.26) holds for $Q = x^{\theta}$ with a $\theta > \tfrac12$ then there exists an absolute constant $c(\theta)$
such that

(A.2)  $$\liminf_{n\to\infty}(p_{n+1}-p_n) \le c(\theta),$$

where $p_n$ is the $n$th prime. For a short, essentially self-contained proof of these facts, see the
preprint arXiv:math.NT/0505300 'Small gaps between primes exist' by Goldston, Motohashi,
Pintz, and Yildirim. The argument is in the framework of Selberg's Sieve and Linnik's Large
Sieve; the latter is in the sense that (2.38) plays a fundamental role. One might surmise
that a proof of the Twin Prime Conjecture is not beyond the reach of today's Analytic Number
Theory.

Addendum (September 20, 2006) 

Because of the improvement [7] of the Mean Prime Number Theorem, the last hypothetical
assertion on bounded differences between consecutive primes might appear within the reach of
the present technology. Until recently there were, however, two main obstacles that prevented
us from making any real incision into the matter: One is the fact that Selberg's Sieve or rather his
sifting procedure, which is highly essential in the argument of Goldston, Pintz, and Yildirim,
did not seem to admit any error terms with the flexibility that (4.5) enjoys, although we
had already a partial result [62] that corresponds to the situation $M = N$ in (4.5). Another
obstacle is that in [7] only the distribution of primes in arithmetic progressions with a fixed
residue class, i.e., $\pi(x; q, a)$ with $a$ fixed, is considered, and to achieve (A.2) we need to relax
this restriction to a considerable extent.

With this, the present situation is that although the latter difficulty still persists,
Motohashi and Pintz have succeeded in suppressing the former in their preprint
arXiv:math.NT/0602599 'A smoothed GPY sieve' by extending the argument of [62]. We note, by the
way, that arXiv:math.NT/0505300 quoted above has been published in Proc. Japan Acad.
82A (2006), 61-65, which can be downloaded freely via the web-page of the academy.
Notes

1) This is my impression, which has arisen from occasionally reading the history of ancient
Mesopotamia. As to the remote origin of the concept of prime numbers and decomposition
of integers into prime factors, there exists a collection of highly striking evidence from
Sumer and Akkad; for instance, see the recent discovery [63] by K. Muroi. Ancient
Babylonian mathematicians tried to make exercises for their students harder or more
enjoyable by embedding not only the familiar 2, 3, 5 but also

7, 11, 13, 17, 19, 23, 29, 31, 41, 47, 59, 79, 83, 137, 139, 1481

into linear and quadratic equations as hidden prime factors of the coefficients. They
were playing with prime numbers, more than 4000 years ago. To me those primes
appear exactly like glittering gems inlaid into the famous treasures from Ur, Nimrud, and
the grave of King Tutankhamen. Perhaps more than that, because those treasures are
perishable but mathematical equations never are. Certainly this list of primes will be
expanded in the near future as the research develops further.

2) Despite this common attribution, it appears likely that the procedure had come from the 
Orient. Ancient mathematicians made academic trips as we do today; an Ebla tablet 
tells such an episode (G. Leick. Mesopotamia. Penguin Books 2002, p. 68). Eratosthenes 
was a poet, astronomer, and director of the great library of Alexandria (3rd century 
BCE); he is said to have been old and unable to enjoy books when he died of voluntary
starvation. 

3) There are two basic queries in the Sieve Method. One is the primality test, and the other 
the number of primes that satisfy a certain set of conditions. To the former, M. Agrawal
et al. made a remarkable contribution a few years ago. However, my interest lies mainly
in the latter query, the quantitative aspect of the distribution of primes. To determine a 
spot to dig where a ruby could certainly be found is definitely harder than to determine 
whether a stone is a ruby or not. I am well aware that the word 'ineffectual' is too 
drastic, for Eratosthenes' Sieve could sometimes be a useful tool. A typical instance is 
in I.M. Vinogradov's proof (1937) of the ternary Goldbach Conjecture (see [64, Kap. VI] 
and R. C. Vaughan's newer argument [76]). See also H. Iwaniec [29]. 

4) Prior to Brun [10] was an incomplete trial by J. Merlin [45]. Brun mentioned this fact 
in [11] precisely; an indication of his respect for the academic axiom, which is often 
purposely ignored nowadays. Fortunately, Brun's name seems to be well-known. For 
instance, at the item 'Number Theory' in World Encyclopedia (Heibon, Tokyo 1972; 
Japanese), C. Chevalley writes 'the unique result known about the Twin Prime Conjec- 
ture is the work by Viggo Brun (1919)' (see the bottom lines of the right column on p. 
458, vol. 16). This is a bizarre opinion, however. Scripta manent. 

5) A.M. Legendre's formulation (1808), though the use of the Mobius function (1832) is a 
later tradition. 

6) This asymptotic formula is an elementary result due to F. Mertens (1874) (see [64, pp.
80-81]); $\exp(-c_E) = 0.561459483566885\ldots$.

7) As to the Prime Number Theorem, see, e.g., [26] [64]. Here the theorem is used in the loose
sense that the probability of the appearance of a prime at $x$ is $1/\log x$.

8) Under the condition $y = x^{\theta}$, $0 < \theta \le \tfrac12$, this appears impossible to prove even on the
Riemann Hypothesis. See Section 2.8.

9) The second assertion follows from the first, because of $x/\log x \ll \pi(x)$, an elementary
assertion due to P.L. Cebycev (1853). See [64, p. 19].

10) Is it an exaggeration to say that it took more than 2000 years for number theorists to 
break themselves of cleaving to equalities? A great king cut the knot with a stroke of his 
sword. 

11) See e.g., [20, §3.4] for details. 

12) This formulation of the sieve dimension might appear made abruptly. I have in fact
skipped the discussion about how to define a mean value of $\omega$. See, for instance, [20, pp.
27-37]. In practice, (1.17) should suffice. Note that there is no a priori sieve dimension
for a given problem to which the Sieve Method is to be applied. It depends largely upon
the choice of the sequence $\mathcal{A}$ into which the problem is embedded. The two subsequent
sections give examples in this context as well.

13) Be aware that at this stage nothing is assumed about the remainder term. Obviously it 
is useless to employ sieve weights that make the estimation of the remainder term too 
hard. Although Brun [12] stated his result including an estimation of the remainder term, 
nowadays the main term is first treated, and then the discussion about the remainder 
term follows. A reason for this splitting of the theory into two parts could be seen from 
e.g., Section 1.9 below. 

14) It is proved in [12] that both of the conjectures could be resolved if one is allowed to 
replace primes by integers with at most 9 prime factors. See Section 3.8 below. 

15) About the distribution of primes in arithmetic progressions, see [64, Kap. IV]. As to the
distribution of primes and the Riemann Hypothesis, see [64, p. 235]. If an unconditional
uniform bound for individual $E(x;k,l)$ is sought, it is hard to supersede the famous
but ineffective Siegel-Walfisz Prime Number Theorem (see [64, p. 144]); namely, $Q =
(\log x)^B$ with any fixed $B > 0$ is the best presently, and any assertion like (1.28) may
appear hopeless. See Section 2.8, however.

16) See e.g., [64, Kap. VI]. 

17) More generally, one may discuss with the system of residues $\Omega(p^a) \bmod p^a$, $a = 1, 2, \ldots$,
being given initially. See Selberg [74] and also my lecture notes [60, §1.1] as well as [54,
III].

18) Without loss of generality, one may suppose that $0 < |\Omega(p)| < p$, as I do within the
present chapter. In fact, if $|\Omega(p)| = 0$, then such a prime does not participate in the sieve
problem under consideration; and if $|\Omega(p)| = p$, then $|S(\mathcal{A}; \Omega, z)| = 0$ as soon as $z$
becomes larger than $p$. One could restrict the domain of $\Omega$ instead.

19) The restriction q < z is made solely for the sake of simplicity. In general, q < Q, q\P(z), 
should be employed, with Q depending on z. 

20) As a matter of fact, Linnik [41] dealt with the case in which q are all primes. His 
motivation was to investigate statistically I.M. Vinogradov's conjecture on least quadratic 
non-residues (an account can be found in [6, p. 7]). Thus the discussion of the present 
section is a refined version of Linnik's argument. The beautiful inequality (2.10) is in 
fact due to H.L. Montgomery [46, Chap. 3]. See also [54, II]. 

21) Selberg's argument is indeed rich; that will become apparent in Sections 2.9 and 4.3 
below. 

22) The estimation (2.20) is not optimal; neither is the bound for $|R|$ in (2.22). In the case
where $|\Omega(p)|$ is bounded, these can be acceptable. However, for instance, if $|\Omega(p)| \sim cp$
with a constant $c > 0$, then (2.19) yields a bound for $\lambda(d)$ far superior to (2.21). In
passing, we note that Selberg's Sieve in the context of the present section can be applied
to general sequences upon the supposition (1.3); see e.g., [20, Chap. 2] for details.

23) Probably because of this comparison, it is often said that Selberg's Sieve is effective only
when $|\Omega(p)|$ is relatively small. This is, however, incorrect, as is proved in the next
section. See also [60, §1.1] [54, II] [54, III].

24) The duality between Linnik's and Selberg's Sieves seems to have been observed for the
first time by myself at a Turan seminar in 1970, as explicitly as is given here. Thus the
contents of this section date back to 1970, although they were published in [54, II] much
later because of an unfortunate circumstance.

25) Selberg's inequality; obviously an extension of Bessel's inequality. The proof is easy; see e.g.,
[6, p. 14] [46, pp. 42-43]. The factor $N - 1 + \delta^{-1}$ in (2.30) is due to Selberg, and is best
possible, a proof of which is in [47]. The history of the Large Sieve, starting at Linnik
[41] and reaching the contents of Section 2.6 (early 1970's), is highly interesting. Above 
all, the great leap made by Bombieri [4] should be acclaimed. That impact brought a 
number of then young people into Analytic Number Theory; some of them are still active. 
In between Bombieri [4] and Selberg's inequality is G. Halasz [21]. 

26) However, one should not forget the fact that Brun's Sieve is capable of giving rise to 
lower bounds as well; for instance, (1.23). 

27) See [60, pp. 129-130]. As to the relation between the distribution of primes in arithmetic 
progressions and the Exceptional Zeros or the Siegel Zeros of Dirichlet L-functions, see 
[64, Kap. IV]. 

28) This is due to Montgomery [46, Chap. 4]. There arbitrary intervals are in fact considered. 
More precisely, he used instead of (2.30) a sharper inequality deducible from (2.29). A 
completely uniform result was later obtained by Montgomery-Vaughan [48]. Here is an
important note: Combining the discussion of Section 2.5 with (2.29), one concludes that
(2.17) is not optimal. That is, it is suggested that the appeal to (2.29) should yield
a procedure with which one could aim at a simultaneous optimization of the main and
remainder terms in Selberg's Sieve applied to intervals. A similar observation is made in
[22, p. 126]. However, my older argument developed in Section 2.5 seems to go deeper.

29) My usage of the word 'avoid' in the present context needs to be explained. It indicates, 
in a somewhat abused way, that there exist situations where one may reach significant 
results on the distribution of primes without appealing to the great hypothesis, though 
arguments might become more involved than on the hypothesis, and results less sharp. 
What is important is that the Twin Prime and the Goldbach Conjectures are contained 
in this category of problems. 

30) To learn under the two great mathematicians, A. Renyi and P. Turan, I arrived in Budapest
by rail in the evening of January 31, 1970. The next day, I saw a black flag over the
entrance of Matematikai Kutatointezet on Realtanoda. The director, Renyi, had died, aged 49.

31) The Zero Density Theory. See e.g., [26] [46] [64] for details. 

32) The multiplicative characters $\{n^{it} : t \in \mathbb{R}\}$ could also be included into the multiplicative
Large Sieve. Unfortunately, I have to skip basic contributions by Halasz [21] and P.X. 
Gallagher [18]. See [6] [26] [46] for details. 

33) The zero density theorem of the Linnik type. The discussion given in [64, Kap. X §2] is 
as hard as Linnik's original, though some simplifications by K.A. Rodosskii are claimed. 
H. Davenport reviewed Linnik's articles, and commented, 'formidable.' A comparatively 
simpler proof is that via Turan's Power Sum Method [75]; see [6, §6]. See also [50] [60, 
Chap. V].

34) To understand this fascinating encounter between a sieve result and Dirichlet L-functions, 
I entered into the Sieve Method. The method could resolve a difficulty that does not 
appear to be settled with analytic arguments only. See [64, pp. 346-347]. The argument 
via the Power Sum Method requires as well the Brun-Titchmarsh theorem; see [6, p. 50]. 

35) See [64, pp. 145-146]. Also [52].

36) Renyi neither proceeded nor stated his result like this. Nevertheless, what he established 
is essentially the same as the Mean Prime Number Theorem as is stated here. 

37) The developments prior to May 1964 are detailed in [2]; see [46, Chaps. 15-17] for an 
account of the later progress up to 1971. 

38) A.I. Vinogradov's assertion is weaker than Bombieri's which is embodied in (2.38). In 
the context of (1.25)-(1.27), they are, however, of the same strength essentially. 

39) The Dispersion Method is a device to introduce perturbations into certain arithmetic 
problems. Statistical arguments are then applied to the perturbed problems. Thus, for instance,
binary additive problems could sometimes be transformed into ternary additive problems 
which are often more tractable. 

40) Bombieri used his own; but in the context of (2.38) there is no difference. See his lecture 
notes [6]. 

41) The present section is due to Selberg [73] and myself [50] [52]-[54] [60, §1.2-§1.4]. An
origin is in [6, Théorème 7A] (Selberg), which corresponds to the case $\Omega(p) = \{0\}$.

42) More correctly, to minimize the main term of the asymptotic formula for (2.37). See 
Selberg [66] [67]. 

43) A proof is in [60, §1.2]; see also [54, II] [54, III]. Following Selberg [73], $\{\psi_q(n, \Omega)\}$
could be termed pseudo-characters, which could be regarded as generalizations of the 
Ramanujan sum. This relation between these arithmetic functions and Selberg's Sieve 
was found by myself [54, II] [60, Notes (I)]. 

44) This is just a formal discussion. In practice, we have to impose realistic conditions on $f$.
See e.g., [53] [57, Lemma 2] [60, §1.3]. 

45) Due to myself [57]; see also [60, §6.2]. The two basic assertions of Linnik mentioned 
in Notes 36-38 and Turan's Power Sum Method are altogether discarded. See also 
[50] [52] [53]. These should be compared with M. Jutila [38] and S. Graham [19]. Further, 
the works [55] [56] [59] are relevant. 

46) Only a few people seem to have had opportunities to look into Rosser's unpublished 
manuscript. Other people, including myself, could see its outline only in Selberg's scant 
account [72]. Thus all published works either on Rosser's Sieve or on the lower sieve 
bound, which is implicitly the main subject of the present section, can be regarded as 
contributions made independently of Rosser. With this understanding, the first general 
result on the lower bound is in Ankeny-Onishi [1], which is a combination of Selberg's 
Sieve and Buchstab's Identity (3.1); see Note 23 above. Buchstab [14] is on the line of 
his [13], and certainly more sophisticated. The first published account of Rosser's Sieve 
is Iwaniec [33], entirely due to himself, a detailed version of which is given in [20, Chaps.
3-4]. As to the Linear Sieve, see Note 52 below.

47) This section is an excerpt from [60, §2.1]. See also [58, I]. 

48) An easy account of the Sieving Limit is in [60, pp. 57-60]. For more details see Selberg 
[72]; and also Note 61 below. 

49) These sets are independent of $z_0$, $z$.

50) This section is due to an observation by Friedlander-Iwaniec [17]. See also [60, pp. 55-59]. 
The sum over p in (3.16) can be estimated via (1.17). 

51) In this respect, the argument originating in Ankeny-Onishi [1] has a definite merit. There
exists a detailed discussion in [20, Chap. 7]. 

52) However, the first published account of the Linear Sieve, i.e., the determination of the
optimal upper/lower main terms, is due to Jurkat-Richert [37]. They started with Selberg's
Sieve and performed iteration via (3.1) in much the same way as Rosser's (i.e., $\beta = 2$).
On the other hand, Iwaniec [27] is the first published account of the Linear Sieve à la
Rosser. Iwaniec [28] dealt with the half dimensional sieve, where Rosser's construction
with $\beta = 1$ yields again the optimal upper/lower main terms. See also [20, §4.5]. By the
way, the work of Jurkat and Richert was applied by J.-R. Chen in his famous work [15] 
on Goldbach's Conjecture; see Note 60 below. All works quoted here are independent of 
Rosser's. 

53) The appearance of this equation might be somewhat unexpected. It is, however, a 
consequence of the prerequisite that the pair of the optimal main terms in the upper 
and lower bounds be stable against the iteration via the Buchstab identity. By the way, 
(3.19) (r = 1) coincides with what Selberg's Sieve implies. Jurkat-Richert [37] starts 
with this fact. As to the analogue of (3.18)-(3.19) for general k, see [20, §4.2]. 

54) Here is the reason why the additional parameter $z_0$ has been introduced. The discussion
in the rest of this section involves in fact a convergence issue, though it is not checked
there. The role of $z_0$ is to secure the convergence. On the other hand, the appearance of
the factor $V(z_0, u)$ is harmless because of Brun's theorem (1.22). In this context, (1.22)
is termed the Fundamental Lemma. The assertion (3.21) results from the combination 
of Rosser's Sieve and Brun's Sieve (or rather the procedure of Section 3.5); the sieve 
weights of the former are multiplied by those of the latter, and the new sieve weights 
thus obtained are again values of a characteristic function. As to the condition $d < z^{\tau}$, it
is in fact the result of taking anew the value of $z$ for a cosmetic purpose; this is possible
because of the smoothness of $\phi_r$. See [60, Chap. III] for more details.

55) A heuristic explanation: Let $\beta_0$ be the infimum thus defined. Assume that there exists
a summand in the second sum on the right of (3.8) such that d < (Dfp(d)) 1 ^". Then 
there exists the possibility that $|S(\mathcal{A}_d, p(d))|$ is positive, as this might be detected by
Rosser's Sieve with $\beta = \beta_0$ and the level $D/d$. That is, a r (d) has to vanish; and we are
led to the condition. 

56) Due to Selberg. See [20, §4.5] for details. 

57) Most probably, it was early in winter 1966. There was a colloquium talk by S. Uchiyama 
at Tokyo University. After his talk, I had a short discussion with him. He said, 'An 
astounding announcement has been made in China.' 'What is that?' 'A mathematician 
named Chen Jing-run has claimed $p + P_2$ for Goldbach's Conjecture.' 'Anything about
his method?' 'The Mean Prime Number Theorem and two sieve lemmas, but I have 
difficulties with one of the latter.' Then he gave me a copy of the now famous announce- 
ment. The dreadful Cultural Revolution had already been spreading, and Chen would 
lose seven years until the publication of the proof. 

58) Chen got $C > 0.67$. A conjecture of Hardy-Littlewood [23] states that with $C = 2$ the
right side of (3.32) is asymptotically equal to $|\{p : N = p + p'\}|$, where $p'$ is also a prime.
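
To make the two counts in this note concrete, here is a minimal Python sketch, offered only as a numerical illustration for small N. It enumerates the representations N = p + m with p prime and m having at most two prime factors (the $p + P_2$ problem of Note 57), and separately the representations with m prime. The function names are hypothetical, trial division is used purely for simplicity, and nothing here reflects the sieve argument itself, which concerns sufficiently large even N.

```python
def prime_factor_count(n: int) -> int:
    """Number of prime factors of n, counted with multiplicity (trial division)."""
    count = 0
    d = 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    if n > 1:  # whatever remains is a single prime factor
        count += 1
    return count


def is_prime(n: int) -> bool:
    return n >= 2 and prime_factor_count(n) == 1


def chen_representations(N: int) -> list[tuple[int, int]]:
    """Pairs (p, m) with N = p + m, p prime, m >= 2 with at most two prime factors."""
    return [(p, N - p) for p in range(2, N - 1)
            if is_prime(p) and prime_factor_count(N - p) <= 2]


def goldbach_representations(N: int) -> list[tuple[int, int]]:
    """Ordered pairs (p, p') of primes with N = p + p'."""
    return [(p, N - p) for p in range(2, N - 1)
            if is_prime(p) and is_prime(N - p)]


if __name__ == "__main__":
    for N in (20, 100, 1000):
        print(N, len(chen_representations(N)), len(goldbach_representations(N)))
```

Every Goldbach representation is also counted by `chen_representations`, so the first count is never smaller than the second.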

59) The procedure (3.34) is a typical instance of weighted sieves, which originates in P. Kuhn 
[39]. See [20, Chap. 5] for a general theory. 

60) Chen applied Jurkat-Richert [37] to the first two terms as mentioned above, and Selberg's
Sieve to the third, which is not much different from the present procedure, as far as 
(3.32) is concerned. The move to the sequence A* is now called the Switching Trick. 
The extension of Bombieri-Vinogradov's Mean Prime Number Theorem to A* is due to 
Chen himself. For a more general extension, see [51] as well as [6, §22]. 

61) Naturally one could take up topics that are not included in the above category of sieves.
For instance, there is a sieve method started by Bombieri [5], which is related to the
elementary proof of the Prime Number Theorem ([64, Kap. III §6]). Recently Friedlander
and Iwaniec (1998) made important progress on this line.

62) See [20, pp. 257-258].

63) The seminal nature of the works [16] and [49] could be stressed with good reason.
There are explicit mentions in Iwaniec [31] [34] [35]. Certain personal recollections might 
be allowed here, as this is presumably the last opportunity for me to write down some 
memorable events from my young days: The preprint of [49] was finished in early autumn 
of 1972. There had been a belief in the air that the Brun-Titchmarsh theorem would not 
be improved beyond (2.36) via the Sieve Method. Therefore, I sent copies of my works 
to Chen, Gallagher, Halberstam, Hooley, and Richert. Hooley replied to me immediately,
kindly showing how to gain a further improvement. He had developed a statistical study
of the Brun-Titchmarsh theorem. My preprint seems to have circulated widely, with
misunderstandings as well; one day I received a letter from the US informing me that I was
rumored to have proved the Twin Prime Conjecture. Richert kindly invited me to an MFO
Tagung (1975). After attending the meeting, I went to Budapest to see my mentor
Turan. He would die, aged 66, in September of the next year. He hinted very faintly at
his illness but gave no sign of its gravity. He wholeheartedly encouraged me at the restaurant
Astoria after my talk at the institute on my sieve results including the Brun-Titchmarsh 
theorem, the zero density of the Linnik type, and the Deuring-Heilbronn phenomenon. 
He liked all, especially the last, even though that made obsolete an important work of 
his already late collaborator S. Knapowski, and thus his Power Sum Method to a certain 
extent. I continued the trip to Warszawa via Krakow to see Iwaniec. He and A. Schinzel
kindly came to pick me up at the central station early on that cold December morning.
Iwaniec had the ambition to extend [49] to Rosser's Sieve. At the next MFO Tagung (1977)
he disclosed to me the surprise that he had already got the essentials of his revolutionary 
work [34]. In 1979 my daughter was born. In that summer I was at the great Durham 
Symposium, and D. Hejhal kindly gave me a private lecture on Kuznetsov's work [40], 
an enormous change of the landscape. There I met also L.-K. Hua; I told him I wanted 
to go to Peking to see Chen. The next autumn I could visit Peking but not Chen, for
reasons unknown to me. In the spring of 1981, I finished my lecture notes [60] at the Tata
IFR. I still see the magnificent sunset over the Arabian Sea.

64) As a matter of fact, we need certain conditions to have the assertion (4.5) valid. See
[58, II] [60, §2.3, §3.4] for the details; there a proof is developed, and it is precisely
reproduced in [20]. By the way, it is possible to improve (4.2) into the same form as
(4.5) but with $M = N = \sqrt{D}$; see [62], which gives an improvement upon (4.3). See the
second Addendum given above.

65) This should be compared with M.N. Huxley [25]. Another striking application is Iwaniec
[30], which stands for the hitherto best approximation to Gauss' conjecture on the
existence of primes of the form $n^2 + 1$.
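
As a small numerical companion to this note, the following Python sketch sorts the values $n^2 + 1$ for small n into primes and products of exactly two primes. It assumes, as the result is commonly described, that the approximation in question concerns values of $n^2 + 1$ with at most two prime factors; the content of [30] is not reproduced here. The function names are hypothetical and trial division is used only for simplicity.

```python
def prime_factor_count(n: int) -> int:
    """Number of prime factors of n, counted with multiplicity (trial division)."""
    count = 0
    d = 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    if n > 1:  # whatever remains is a single prime factor
        count += 1
    return count


def classify_n2_plus_1(limit: int) -> tuple[list[int], list[int]]:
    """Return (n with n^2+1 prime, n with n^2+1 a product of two primes), 1 <= n <= limit."""
    prime_n, two_factor_n = [], []
    for n in range(1, limit + 1):
        k = prime_factor_count(n * n + 1)
        if k == 1:
            prime_n.append(n)
        elif k == 2:
            two_factor_n.append(n)
    return prime_n, two_factor_n


if __name__ == "__main__":
    primes, almost_primes = classify_n2_plus_1(10)
    print(primes)         # n for which n^2 + 1 is prime
    print(almost_primes)  # n for which n^2 + 1 has exactly two prime factors
```

For n up to 10, only n = 7 falls outside both lists, since 50 = 2 · 5 · 5 has three prime factors.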

66) See [61, §4.2] as well as [9].